Second, comparative genomic hybridization (CGH) has revealed the presence of copy
number increases in tumors, even in chromosomal regions outside of HSRs. CGH is a new
method in which whole chromosome spreads are stained simultaneously with DNA fragments from
normal cells and from cancer cells, using two different fluorochromes. The images are
computer-processed for the fluorescence ratio, revealing chromosomal regions that have
undergone amplification or deletion in the cancer cells (Kallioniemi et al. 1992). This method was
recently applied to 15 breast cancer cell lines (Kallioniemi et al. 1994). DNA sequence copy
number increases were detected in all 23 chromosome pairs.
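
The ratio analysis used in CGH can be illustrated with a minimal sketch. It assumes that the fluorescence from the cancer-cell DNA and from the normal-cell DNA has already been quantified for each chromosomal region; the function name, the region labels, and the ratio thresholds are hypothetical and are not taken from the cited references.

```python
def classify_cgh_regions(tumor_signal, normal_signal, gain_ratio=1.25, loss_ratio=0.8):
    """Label each chromosomal region by its tumor/normal fluorescence ratio.

    The thresholds are illustrative assumptions, not published values.
    """
    calls = {}
    for region, tumor in tumor_signal.items():
        normal = normal_signal[region]
        if normal <= 0:
            calls[region] = "no call"  # ratio is undefined without normal signal
            continue
        ratio = tumor / normal
        if ratio >= gain_ratio:
            calls[region] = "amplified"
        elif ratio <= loss_ratio:
            calls[region] = "deleted"
        else:
            calls[region] = "unchanged"
    return calls
```

A region whose cancer-cell signal is twice the normal signal would be labeled amplified under these assumed thresholds, and one at half the normal signal would be labeled deleted.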

Cloning the genes that undergo duplication in cancer is a formidable challenge. In one
approach, human oncogenes have been identified by hybridizing with probes for other known
growth-promoting genes, particularly known oncogenes in other species. For example, the erbB2
gene was identified using a probe from a chemically induced rat neuroglioblastoma (Slamon et al.).
Genes with novel sequences and functions will evade this type of search. In another approach,
genes may be cloned from an area identified by the CGH method as containing a duplicated region.
Since CGH is able to indicate only the approximate chromosomal region of duplicated genes, an
extensive amount of experimentation is required to walk through the entire region and identify the
particular gene involved.

Genes may also be overexpressed in cancer without being duplicated. Methods that rely
on identification from genetic abnormalities necessarily bypass such genes. Increased expression
can come about through a higher level of transcription of the gene; for example, by up-regulation of
the promoter or substitution with an alternative promoter. It can also occur if the transcription
product is able to persist longer in the cell; for example, by increasing the resistance to cytoplasmic
RNase or by reducing the level of such cytoplasmic enzymes. Two examples are the epidermal
growth factor receptor, overexpressed in 45% of breast cancer tumors (Klijn et al.), and the IGF-1
receptor, overexpressed in 50-93% of breast cancer tumors (Berns et al.). In almost all cases, the
overexpression of each of these receptors is by a mechanism other than gene duplication.

One way of examining overexpression at the messenger RNA level is by subtractive
hybridization. It involves producing positive and negative cDNA strands from two RNA
preparations, and looking for cDNA which is not completely hybridized by the opposing preparation.
This is a laborious procedure which has distinct limitations in cancer research. In particular, since
each subtraction involves cDNA from only two cell populations at a time, it is sensitive to individual
phenotypic differences due not just to the presence of cancer, but also to natural metabolic
variations.

Another way of examining overexpression at the messenger RNA level is by differential
display (Liang et al. 1992a). In this technique, cDNA is prepared from only a subpopulation of each
RNA preparation, and expanded via the polymerase chain reaction using primers of particular
specificity. Similar subpopulations are compared across several RNA preparations by gel
autoradiography for expression differences. In order to survey the RNA preparations entirely, the
assay is repeated with a comprehensive set of PCR primers. The screening strategy more
effectively includes multiple positive and negative control samples (Sunday et al.). The method has
recently been applied to breast cancer cell lines, and highlights a number of expression differences
(Liang et al. 1992b; Chen et al.; McKenzie et al.; Watson et al. 1994 & 1996; Kocher et al.). By
excising the corresponding region of the separating gel, it is possible to recover and sequence the
cDNA.

Despite the advancement provided by differential display, problems remain in terms of
applying it in the search for new cancer genes. First, because this is a test for RNA levels, any
phenotypic difference between cell lines constitutes part of the recovered set, leading to a large
proportion of "false positive" identifications. It has been found that cDNA for mitochondrial genes
constitute a large proportion of the differentially expressed bands, and it consumes substantial
resources to recover the sample and obtain a partial sequence in order to eliminate them. Second,
false positive identifications are made for reasons attributed to multiple cDNA species and
competition for the PCR primers by RNA species of different abundance (Debouck). Third,
differential display highlights high copy number mRNAs and shorter mRNAs (Bertioli et al.;
Yeatman et al.), and may therefore miss critical cancer-associated transcripts when used as a
survey technique. Fourth, a number of adjustments are made to gene expression levels when a
cell undergoes malignant transformation or is cultured in vitro. Most of these adjustments are
secondary, and not part of the transformation process. Thus, even when a novel sequence is
obtained from the differential display, it is far from certain that the corresponding gene is at the root
of the disease process.

An early step in developing gene-specific therapeutic approaches is the identification of
genes that are more central to malignant transformation or the persistence of the malignant
phenotype.

It is an objective of this invention to provide a method for identifying and characterizing
genes and gene products which are duplicated or associated with overabundant RNA in cancer
cells. The method can be used for any type of cancer, providing a plurality of cell populations or
cell lines of the type of cancer are available, in conjunction with a suitable control cell population.
The method is highly effective in identifying genes and gene products that are intimately related to
malignant transformation or maintenance of the malignant properties of the cancer cells.

An important derivative of applying the method is the selection and retrieval of cDNA and
cDNA fragments corresponding to the cancer-associated gene. These fragments can be used
inter alia to determine the nucleotide sequence of the gene and mRNA, the amino acid sequence of
any encoded protein, or to retrieve from a cDNA or genomic library additional polynucleotides
related to the gene or its transcripts. Since the genes are typically involved in the malignant
process of the cell, the polynucleotides, polypeptides, and antibodies derived by using this method
can in turn be used to design or screen important diagnostic reagents and therapeutic compounds.

Another objective of this invention is to provide isolated polynucleotides, polypeptides, and
antibodies derived from four novel genes which are associated with several different types of cancer,
including breast cancer. The genes are designated CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and
CH14-2a16-1. These designations refer to both strands of the cDNA and fragments thereof, and to
the respective corresponding messenger RNA, including splice variants, allelic variants, and
fragments of any of these forms. These genes show RNA overabundance in a majority of cancer cell
lines tested. A majority of the cells showing RNA overabundance also have duplication of the
corresponding gene. Another object of this invention is to provide materials and methods based on
these polynucleotides, polypeptides, and antibodies for use in the diagnosis and treatment of cancer,
particularly breast cancer.

Accordingly, one embodiment of this invention is an isolated polynucleotide comprising a
linear sequence contained in a polynucleotide selected from the group consisting of CH1-9a11-2,
CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. The linear sequence is contained in a duplicated
gene or overabundant RNA in cancerous cells. The RNA may be overabundant due to gene
duplication, increased RNA transcription or processing, increased RNA persistence, any combination
thereof, or by any other mechanism, in a proportion of breast cancer cells. Preferably, the RNA is
overabundant in at least about 20% of a representative panel of breast cancer cell lines, such as the
panels listed herein; more preferably, it is overabundant in at least about [...] of the panel; even more
preferably, it is overabundant in at least 60% or more of the panel. Preferably, the RNA is
overabundant in at least about 5% of spontaneously occurring breast cancer tumors; more preferably,
[...] about 20% of such tumors; more preferably, it is overabundant in at least about 30% of such
tumors; even more preferably, it is overabundant in at least about 50% of such tumors.

Preferably, a sequence of at least 10 nucleotides is essentially identical between the isolated
polynucleotide of the invention and a cDNA from CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and
CH14-2a16-1; more preferably, a sequence of at least about 15 nucleotides is essentially identical;
more preferably, a sequence of at least about 20 nucleotides is essentially identical; more preferably,
a sequence of at least about 30 nucleotides is essentially identical; more preferably, a sequence of at
least about 40 nucleotides is essentially identical; even more preferably, a sequence of at least about
70 nucleotides is essentially identical; still more preferably, a sequence of about 100 nucleotides or
more is essentially identical. A further embodiment of this invention is an isolated polynucleotide
comprising a linear sequence essentially identical to [...] polynucleotide which is a DNA
polynucleotide, an RNA polynucleotide, a polynucleotide probe, or a polynucleotide primer.

This invention also provides an isolated polypeptide comprising a sequence of amino acids
essentially identical to the polypeptide encoded by or translated from a polynucleotide selected from
the group consisting of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. Preferably, a
sequence of at least about 5 amino acids is essentially identical between the polypeptide of this
invention and that encoded by the polynucleotide; more preferably, a sequence of at least about 10
amino acids is essentially identical; more preferably, a sequence of at least 15 amino acids is
essentially identical; even more preferably, a sequence of at least 20 amino acids is essentially
identical; still more preferably, a sequence of about 30 amino acids or more is essentially identical.
Preferably, the polypeptide comprises a linear sequence of at least 15 amino acids essentially
identical to a sequence encoded by said polynucleotide. Another embodiment of this invention is a
polypeptide comprising a linear sequence essentially identical to a sequence selected from the group
consisting of SEQ. ID NO:17, SEQ. ID NO:20, SEQ. ID NO:25, SEQ. ID NO:28, SEQ. ID NO:[...],
SEQ. ID NO:32, SEQ. ID NO:34, and SEQ. ID NO:37.

A further embodiment of this invention is an antibody specific for a polypeptide embodied in
this invention. This encompasses both monoclonal and isolated polyclonal antibodies.

A further embodiment of this invention is a method of using the polynucleotides of this
invention for detecting or measuring gene duplication in cancerous cells, especially but not limited to
breast cancer cells, comprising the steps of reacting DNA contained in a clinical sample with a
reagent comprising the polynucleotide, said clinical sample having been obtained from an individual
suspected of having cancerous cells; and comparing the amount of complexes formed between the
reagent and the DNA in the clinical sample with the amount of complexes formed between the
reagent and DNA in a control sample.
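
The comparing step can be illustrated with a minimal sketch. It assumes that complex formation has been quantified as a signal intensity (for example, band density on an autoradiogram) and that a loading-control signal is available for normalization; the function names, the normalization, and the 1.5-fold cutoff are hypothetical choices for illustration and are not specified by this application.

```python
def relative_complex_amount(sample_signal, sample_loading, control_signal, control_loading):
    """Return the loading-normalized ratio of clinical-sample signal to control signal."""
    if min(sample_loading, control_loading, control_signal) <= 0:
        raise ValueError("loading and control signals must be positive")
    return (sample_signal / sample_loading) / (control_signal / control_loading)


def suggests_duplication(ratio, cutoff=1.5):
    """Flag a sample whose normalized signal exceeds the control by the assumed cutoff."""
    return ratio >= cutoff
```

The same comparison could be applied, with an appropriate cutoff, to RNA, antibody, or polypeptide complexes measured against a control sample.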

A further embodiment is a method of using the polynucleotides of this invention for detecting
or measuring overabundance of RNA in cancerous cells, especially but not limited to breast cancer
cells, comprising the steps of reacting RNA contained in a clinical sample with a reagent comprising
the polynucleotide, said clinical sample having been obtained from an individual suspected of having
cancerous cells; and comparing the amount of complexes formed between the reagent and the RNA
in the clinical sample with the amount of complexes formed between the reagent and RNA in a control
sample.

A further embodiment of the invention is a diagnostic kit for detecting or measuring gene
duplication or RNA overabundance in cells contained in an individual as manifest in a clinical sample,
comprising a reagent and a buffer in suitable packaging, wherein the reagent comprises a
polynucleotide of this invention.

Another embodiment of this invention is a method of using a polypeptide of this invention for
detecting or measuring specific antibodies in a clinical sample, comprising the steps of reacting
antibodies contained in the clinical sample with a reagent comprising the polypeptide, said clinical
sample having been obtained from an individual suspected of having cancerous cells, especially but
not limited to breast cancer cells; and comparing the amount of complexes formed between the
reagent and the antibodies in the clinical sample with the amount of complexes formed between the
reagent and antibodies in a control sample.

Another embodiment of this invention is a method of using an antibody of this invention for
detecting or measuring altered protein expression in a clinical sample, comprising the steps of
reacting a polypeptide contained in the clinical sample with a reagent comprising the antibody, said
clinical sample having been obtained from an individual suspected of having cancerous cells,
especially but not limited to breast cancer cells; and comparing the amount of complexes formed
between the reagent and the polypeptide in the clinical sample with the amount of complexes formed
between the reagent and a polypeptide in a control sample. Further embodiments of this invention
are diagnostic kits for detecting or measuring a polypeptide or antibody present in a clinical sample,
comprising a reagent and a buffer in suitable packaging, wherein the reagent respectively comprises
either an antibody or a polypeptide of this invention.

Yet another embodiment of this invention is a host cell transfected by a polynucleotide of this
invention. A further embodiment of this invention is a method for using a polynucleotide for screening
a pharmaceutical candidate, comprising the steps of separating progeny of the transfected host cell
[...] pharmaceutical candidate; and comparing the phenotype of the treated cells with that of the
untreated cells.

This invention also embodies a pharmaceutical preparation for use in cancer therapy,
comprising a polynucleotide or polypeptide embodied by this invention, said preparation being
capable of reducing the pathology of cancerous cells, especially for but not limited to breast cancer
cells. Further embodiments of this invention are methods for treating an individual bearing cancerous
cells, such as breast cancer cells, comprising administering any of the aforementioned
pharmaceutical preparations.

[...] preparation from control cells; f) supplying digested DNA preparations from at least two different
cancer cells; g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and
step f); and h) further selecting cDNA from the cDNA of step d) corresponding to genes that are
duplicated in the cancer cells of step f) relative to the control cells of step e).

[...] the methods of this invention, including the following:

1. Cancer cells are preferably used for step b) that share a duplicated gene in the same
   region of a chromosome. If desired, the practitioner may test cancer cells beforehand
   to detect the duplication or deletion of chromosome regions; or cancer cell lines may
   be used that have already been characterized in this respect.

2. A higher plurality of cancer cells are preferably used to provide DNA for step b), step f),
   or preferably both step b) and step f). The use of three cancer cells is preferred over
   two; the use of four cancer cells is more preferred, about five cancer cells is still more
   preferred, about eight cancer cells is even more preferred. The cDNA of each cancer
   cell population is displayed or hybridized separately, in accordance with the method.

3. A higher plurality of control cells are preferably used to provide DNA for step a), step
   e), or preferably both step a) and step e). The use of two control cell populations is
   preferred; the use of three or more is even more preferred. Both proliferating and
   non-proliferating populations are preferably used, if available.

4. The control cells are preferably supplied fresh from a tissue source, and are not
   cultured or transformed into a cell line. This is increasingly important when only one
   or two control cell populations are used in step a). Freshly obtained cancer
   cells may also be used as an alternative to cancer cell lines, although this is less
   critical.

5. An additional screening step is preferably conducted in which the cDNA corresponding
   to the putative cancer-associated gene is additionally hybridized with a digested
   mitochondrial DNA preparation, to eliminate mitochondrial genes. This screening step
   may be conducted before, between, subsequent to, or simultaneously with the other
   screening steps of the method.

6. An additional screening step is preferably conducted in which RNA is supplied from a
   plurality of cancer cells, and one or preferably more control cell populations; the RNA is
   contacted with cDNA corresponding to the putative cancer-associated gene under
   conditions that permit formation of a stable duplex, and cDNA is selected
   corresponding to RNA that is present in greater abundance in a proportion of the
   cancer cells relative to the control cells. Preferably, the plurality of cancer cells is a
   panel of at least five, preferably at least ten cells. Preferably at least three, more
   preferably at least five, of the cancer cells show greater abundance of RNA. Preferably
   at least one, and preferably more, of the cancer cells shows a greater abundance of
   RNA compared with control cells, but does not show duplication of the corresponding
   gene in step h) of the method.
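
The overall selection logic (RNA overabundance in several cancer cell populations, gene duplication in several cancer cell populations, and elimination of mitochondrial genes) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure of this application: it assumes each candidate cDNA has already been scored as a numeric signal per cell population, and the `Candidate` class, the two-fold cutoff, and the minimum counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate cDNA; each dict maps a cell population name to a measured signal."""
    name: str
    rna_cancer: dict
    rna_control: dict
    dna_cancer: dict
    dna_control: dict
    mitochondrial: bool = False


def _count_elevated(cancer, control, fold):
    """Count cancer populations whose signal is at least `fold` times the highest control."""
    ceiling = max(control.values())
    return sum(1 for value in cancer.values() if value >= fold * ceiling)


def select_candidates(candidates, min_rna=2, min_dna=2, fold=2.0):
    """Keep cDNA overabundant as RNA and duplicated as DNA in enough cancer populations."""
    selected = []
    for cand in candidates:
        if cand.mitochondrial:
            continue  # corresponds to the mitochondrial screening step (item 5)
        rna_hits = _count_elevated(cand.rna_cancer, cand.rna_control, fold)
        dna_hits = _count_elevated(cand.dna_cancer, cand.dna_control, fold)
        if rna_hits >= min_rna and dna_hits >= min_dna:
            selected.append(cand.name)
    return selected
```

Comparing each cancer population against the highest control value is a deliberately conservative choice here; using several control populations, as item 3 recommends, makes that ceiling more representative.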

Other embodiments of the invention are methods for obtaining cDNA corresponding to a
gene that is deleted or underexpressed in cancer, comprising the steps of a) supplying an RNA
preparation from control cells; b) supplying RNA preparations from at least two different cancer
cells that share a deleted gene in the same region of a chromosome; c) displaying cDNA
corresponding to the RNA preparations of step a) and step b) such that different cDNA
corresponding to different RNA in each preparation are displayed separately; and d) selecting
cDNA corresponding to RNA that is present in lower abundance in the cancer cells of step b)
relative to the control cells of step a). Such methods typically comprise the following further steps:
e) supplying a digested DNA preparation from control cells; f) supplying digested DNA
preparations from at least two different cancer cells; g) hybridizing the cDNA of step d) with the
digested DNA preparations of step e) and step f); and h) further selecting cDNA from the cDNA of
step d) corresponding to a gene that is deleted in the cancer cells of step f) relative to the control
cells of step e). Such methods for identifying deleted or underexpressed genes may also comprise
enhancements such as those described above.

Additional embodiments of this invention are methods for characterizing cancer genes [...].
The cDNA may be used to retrieve additional polynucleotides corresponding to a cancer-associated
gene in an mRNA preparation, or a cDNA or genomic DNA library.

Additional embodiments of this invention are methods for screening candidate drugs for
cancer [...], comprising obtaining cDNA corresponding to a gene that is duplicated, deleted, or [...].

Various embodiments of this invention may be employed in pursuit of any form of cancer [...].



«S»™f »aha»^one,ep™duc«c.of.„.«o«to,„mof.dlt^ 
.«».a.e^cONAco^po„.„,,as,.^of™,«a«^,^,,,^::^„^ 

l^.s»used»aelec.cDNAco™po,,«„,top.,ta..^R^.A.ha,«,o«^ 
^rl* r"*"' °' '•"""^ ^ec«,P^ ONA dige«, tan a 

Figure 3 is a half-tone reproduction of an autoradiogram of electrophoresed [...] breast cancer cell
lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel B).

Figure 4 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a
panel of breast cancer cell lines probed with a CH13-2a12-1 insert.

Figure 5 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of
breast cancer cell lines probed with a CH13-2a12-1 insert.



Figure 6 is a map of cDNA fragments obtained for the breast cancer associated genes CH1-9a11-2,
CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. Regions of the fragments used to deduce sequence
data listed in the application are indicated by shading. Nucleotide positions are numbered from the
left-most residue for which double-strand sequence data has been obtained, which is not necessarily
the 5' terminus of the corresponding message.

Figure 7 is a listing of primers used for obtaining the cDNA sequence data for CH1-9a11-2.

Figure 8 is a listing of cDNA sequence obtained for CH1-9a11-2.

Figure 9 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH1-9a11-2 shown in Figure 8. The single-letter amino acid code is used.
Stop codons are indicated by a dot (.). The upper panel shows the complete amino acid translation;
[...] transmembrane region is indicated by underlining.

Figure 10 is a listing of primers used for obtaining the cDNA sequence data for CH8-2a13-1.

Figure 11 is a listing of cDNA sequence obtained for CH8-2a13-1.

Figure 12 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH8-2a13-1 shown in Figure 11. The upper panel shows the complete amino
acid translation; the lower panel shows the predicted gene product protein sequence.

Figure 13 is a listing of the nucleotide sequence predicted for a full-length CH8-2a13-1 cDNA.

Figure 14 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH8-2a13-1 shown in Figure 13.



wo 97O8085 



PCT/US97/0S930 



Figure 15 is a listing of primers used for obtaining the cDNA sequence data for CH13-2a12-1.

Figure 16 is a listing of cDNA sequence obtained for CH13-2a12-1.

Figure 17 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH13-2a12-1 shown in Figure 16. The upper panel shows the complete amino
acid translation; the lower panel shows the predicted gene product protein sequence.

Figure 18 is a listing of primers used for obtaining cDNA sequence data for CH13-2a12-1.

Figure 19 is a listing of the cDNA sequence data obtained by two-directional sequencing for
CH14-2a16-1.

Figure 20 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH14-2a16-1 shown in Figure 19. The upper panel shows the complete amino
acid translation; the lower panel shows the predicted gene product protein sequence. Residues
corresponding to three zinc finger motifs are underlined, indicating that the protein may have DNA or
RNA binding activity.



Figure 21 is a listing of additional DNA sequence data towards the 5' end of CH14-2a16-1 obtained
by one-directional sequencing of the fragment pCH14-1.3. The first two panels show nucleotide and
amino acid sequence from the 5' end of the fragment; the second two panels show nucleotide and
amino acid sequence from the 3' end of the fragment. Regions of overlap with pCH14-[...] are
underlined.

30, J5 fc, . Win, « «^ ^ ^ ^ 

«^ « IS a «„, 0, «e amino «,,«„ce cc«,x»rt,„g b .h. »^ 
JO stop codons are indicated by a dot (•). 

Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising
approximately 1934 base pairs 5' from the sequence of Figure 19.

Figure 26 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH1-9a11-2 shown in Figure 25. The single-letter amino acid code is used.
Stop codons are indicated by a dot (.). The upper panel shows the complete amino acid translation;
the lower panel shows the predicted gene product protein sequence.

BEST MODE FOR CARRYING OUT THE INVENTION

This invention relates to the discovery and characterization of four novel genes associated
with breast cancer. The cDNA of these genes, and their sequences as disclosed below, provide the
basis of a series of reagents that can be used in diagnosis and therapy.

[...] in 40-[...]% of the cells tested. Surprisingly, each of the four genes was duplicated in at least one
cell line where studies using comparative genomic hybridization had not revealed any amplification of
the corresponding chromosomal region.

Levels of expression at the mRNA level were tested in a similar panel for two of these four
genes. In addition to those cell lines showing gene duplication, 17 to 21% of the lines showed RNA
overabundance without gene duplication, indicating that the malignant cells had used some
mechanism other than gene duplication to promote the abundance of RNA corresponding to these
genes. [...] of the breast cancer genes have open reading frames, and likely are transcribed at
various levels in different cell types; overabundance of the corresponding RNA is
likely associated with overexpression of the protein gene product. Such overexpression may be
manifest as increased secretion of the protein from the cell into blood or the surrounding environment,
an increased density of the protein at the cell surface, [...]
the cell, in comparison to the typical level in noncancerous cells of the same tissue type.

Different tumors bear different genotypes and phenotypes, even when derived from the same
[...] supporting the malignancy of the cancer. This invention discloses genes that achieve RNA
overabundance by several mechanisms, because they are more likely to be directly involved in the
pathogenic process, and therefore suitable targets for pharmacological manipulation.

Features of the four novel genes, the respective mRNA, and the cDNA used to find them are
provided in Table 1.

TABLE 1: Characteristics of 4 Novel Breast Cancer Genes

Chromosome   Designation    mRNA Observed     Exemplary cDNA Fragments Cloned
1            CH1-9a11-2     5.5 kb, 4.5 kb    1.1 kb, 2.5 kb
8            CH8-2a13-1     4.2 kb            0.6 kb (two), 3.0 kb, 4.0 kb
13           CH13-2a12-1    3.5 kb, 3.2 kb    1.6 kb, 3.5 kb
14           CH14-2a16-1    3.8 kb, 3 kb      0.8 kb, 1.3 kb, 1.6 kb, 2.5 kb


All four gene sequences are unrelated to other genes known to be overexpressed in breast
cancer, including the erbB2 gene (Adnane et al.), tissue factor (Chen et al.), mammaglobin (Watson
et al.), and DD96 (Kocher et al.).

The four mRNA sequences each comprise an open reading frame. The CH1-9a11-2 gene is
expressed at the mRNA level at relatively elevated levels in pancreas and testis. The CH8-2a13-1
gene is expressed at relatively elevated levels in adult heart, spleen, thymus, small intestine, colon,
and tissues of the reproductive system; and at higher levels in certain tissues of the fetus. The
CH13-2a12-1 gene is expressed at relatively elevated levels in heart, skeletal muscle, and testis. The
CH14-2a16-1 gene is expressed at relatively elevated levels in testis. The level of expression of all four
genes is especially high in a substantial proportion of breast cancer cell lines.

[...] expressed as a surface protein on cancer cells. The CH13-2a12-1 gene is distantly related to a
C. elegans gene implicated in cell cycle regulation, and may play a role in the regulation of cell
proliferation. The protein encoded by CH13-2a12-1 is distantly related to a vasopressin-activated
calcium binding receptor, and may have Ca2+ binding activity. The CH14-2a16-1 comprises at least
five [...] zinc finger [...]. The
CH14-2a16-1 gene product is suspected of having DNA or RNA binding activity [...]
role in cancer pathogenesis.

cane 22r" « Of 3.n« »^ ^ ,„ 

"«y»*.rl»«hDNA«„p««».„,^.«e«,RNAabu™^ ^ 
ab«»ma. gene «sula*>a I, cents* « «e ™ii,r«^ 
brought to bear on any type of cancer. 

The screening method is superior to any previously available approach in several respects.
Particularly significant is that screening is rapidly focused towards genes that are central to the
malignant process, and away from those that have variable levels of expression as part of normal

Definitions

Terms used in this application include the following:
-.on*„l«,« exampies o, polyn»aeo6d«: a gene or «„e te,™„, „ ^ 

in the contM of povn^aeotidea. a "tea, aeqaence" o, . s,,^ „ „ J! , 

»«.~««.,pov^«.,„,s,o3.o,n«.^.««,^^^:*::' 

«. »e . «» pn^a^ a^ „, „e po^n^eo^a. A "pa 

™*™ >»««on in wWch one or more poi,n«teoW^ 

y..^».,eac.on™,con«M..«.p..„^.«^;;^^ 

PCR, or ma enzymate Cleavage 01 a polynucleoM, b, a llbozym. 

"''««'^°"'»««onscanPepartbnnadon<^oo«lldon.ofd^ 

at inr:'' a,. .» „^ ^ 

aiB ««. condlbona. a-c aa hgher lanw«,« and lower aodium ion concanlMon wni* 

Nn^nun. ^ e^ ^ a a». hy^^aC^T 
published in the art: see, for example, "Molecular Cloning: A Laboratory Manual", Second Edition
(Sambrook, Fritsch & Maniatis, 1989).

When hybridization occurs in an antiparallel configuration between two single-stranded
polynucleotides, the reaction is called "annealing" and those polynucleotides are described as
"complementary". A double-stranded polynucleotide can be "complementary" to another
polynucleotide, if hybridization can occur between one of the strands of the first polynucleotide and
the second. Complementarity (the degree that one polynucleotide is complementary with another) is
quantifiable in terms of the proportion of bases in opposing strands that are expected to form
hydrogen bonding with each other, according to generally accepted base-pairing rules.
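
As an illustration of how complementarity may be quantified, the following minimal sketch computes the proportion of positions that form a Watson-Crick pair when two strands are aligned antiparallel. It assumes two DNA strands of equal length, both written 5' to 3', with no gaps or mismatch tolerance; the function name and the `PAIRS` table are hypothetical, and RNA (with U in place of T) is not handled.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity(strand_a, strand_b):
    """Fraction of positions that base-pair when strand_b is read antiparallel to strand_a."""
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse so the two strands are compared antiparallel
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(1 for x, y in zip(a, b) if PAIRS.get(x) == y)
    return paired / len(a)
```

For example, a strand and its exact reverse complement would score 1.0 under this measure, and each mismatched position lowers the score proportionally.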

10 * ""^ ^^'"'^ "'«*<«to Is -ktotor to anotto IhM, ,«,u,„,^ „ ^ ^ 

. eac. ™ „ sa™. an. occa™ ^ Zl „ 

t. a<,u.ate„, ,„ ^ „, ^^^^^ ^ ^ i„ter^«.„ „, Z 

t«es. pa«a,.^ ^ ^ „ ^ ^ 

^^^-"'"-".an^.a^sa^^. M and a ONA po,™o.^.t^ 

^auancaa a™ capable o, to fam, a *p,« w«, comp,aa««^^u««r 

SSC. o, aubou, «0C l„ 0.5 X S8C. a, « .bou 300C ^ 6 X SSC con-^^ 50% J 
™.«.^.»byb««a,«0Ca,hW««2xSSCor,^»,.ipJ3^'^ 

2^ ^ povai^as sb.« ba tostoa a„ae, b,b. ^J^":;' 
l»»n».co™p.«b^™.p.^. G«,a«V..«an.a», Itasca, s«,„an»a,™ a, »a«ab=a, 
target polynucleotide, and whether the polynucleotide encodes [...]

«^ are p^fe^d over nucWde Sd.s««.ns «a. c«a.e a «.p c«,onl^ 

P°*'"'«*«i*.i«%»P«de.o,anli^ 

» <«enn«d by addh, a a,k»*, ^ ^ ^ 3^'^,^^ 

™-.d-«=^pnx^ ■«'0'*^-a.««n,a-«^™,rrrc!^ 

•<«W-or-co™pteMm,,,DNA-|.a«n9le-o.doubMrandadOW 
^«co,npten»n,an,tean«*e™„,.„^RNAn«c*. A "cDNA ».,„«r ^ ^ 

.»man«h„. ^ « ««np,.. „««„^ „^ ^ ^ ^ ^ 
[...] or by chemical synthesis of a DNA based upon knowledge of the RNA sequence. cDNA
also corresponds to the gene that encodes the RNA. Polynucleotides may be said to correspond
even when one of the pair is derived from only a portion of the other.

A "probe" when used in the context of polynucleotide manipulation refers to a polynucleotide
[...] a target potentially present in a sample [...]
the target. Usually, a probe will comprise a label or a means by which a label can be
attached, either before or subsequent to the hybridization reaction. Suitable labels include, but are not
limited to radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and enzymes
10 ooten.,,,^! ''^^""^ PolynudeoMe, ««» a l« 3' -OH ,™,p, tba. binds to a ta^ 

0, a po^^c^ ^^„«, to .a,^ A -p^yn^as, chainlet^ 
^R).sa,ea«.,n«b,cb,e^»at.co^a«nu«.o.a.^p^„C«,^„^^^„^ 
pmn.n>, and a cata^^t of po,,n„nza.«,, such as a n«,«. ttanscdptas. or a DNA po!n««3 
Pa^^^^.thenha^stabfep^n^^seen^™. Mothods tor PC ^ taughtt utZT^ 

sa™p«^c«.eo«e.suchasPCRo,ge«c^„l„,,3™c^,*^,^^^^„,^.,;^^;^ 

5 «^Z ."''^'^*"'^''»'»™'~"'«'P"*i"a™''unct«»>a,l,«,ated 

irnl '^"^""^'^'" — ^"edtop.o™,.,!,^, 

'"^7'^'*«-"«»«»9ta.tansa*to,,^aa,,«^ 
!0 «ns,at«», „„„b.n s«es, p^tatn ««odN ^ ^'"^ 

d^?r T """^ °" bcaed dcwnebe™ f„ ^ ^ 

dire*") i™, the p™™*,. 'Ope^b., an,«- ,0 a )«.posl*„ o, .enetic e^ 
*^»-«''«=«-«PP''-*.*a..oop.,3t..^eexpec«^^ P„ 

this functional relationship is maintained.

'f™ *"'^'««<'t»*nBd«cdb. the pracesswhe^y an inc^^^ 

r*"«7"« = P-«-a'»«o,a^„»,,W,,p««.^apa^ 
) G«"aampl,flca«oh-g«^|,,,,„o,^^g^,,^^ 

•Exp«^-|sd«k««B„«,4,i,^^ 
^. an RNA polynuCodde, . as ^ .^.J^* a 

nr°: *"^''*"''"^'°'-»™«^-9""«y«^*.lh.p,0d„cn^ 
«»««*^^sp«*dc,e*,tedo^e. Th..-RNAo«,»^.«^ 

"«« RNA (as a p»portio„ of Ma. R»«, fion, a pa^cuiarsene in a c«, bains de«*ed sj^! 

^ ov-^xp^salon- p,es««e o, «^ ^,^1^ . 
Pra*«w<by,lbr«ampHacanc«ogsc«l P piasent in or 
"Abundance" of RNA refers to the amount of a particular RNA

In the context of polypeptides, a "linear sequence" or a "sequence" is an order of

«<«"ces IBM a subaanlial degree of seqiiaw Menllly. It is undereloM »,.. if^ , . . 

■>»« «s,l, ictoaed. Fo, example, subatuta of „, a™™ edd wUh ft,a„pM*: Me ctom. 
3-c„»«*tecW„».po*.^c.ei,.,s^<*.„s.*3po,««o,„eg,Jc.Cro,^tl,s 
occur w»ho« daurtNn, ll» 1*^ „ ^ ^ til 

-«^^«d,c.,^^^,^„3:2.r:^.:rr^ 

„ .e ee™ poeKio™: pc^on^ 

" « "-".^ ««« » a. j:^ 

^.d«-tol.a«comp™ea.,eas.abou.a«,»,5%wl,k*.,e«l.»,«.«,^ 

P^. .»y «.npdse a. teas, abo^ 70% IdenUca, ^eiduee c oo.^ ^ 
P»*™b,, 1,^ o«„p^ a, ^ 3^ „ ^^^^^^^ ™. 
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y compose a, Isas. ^ „ ^ 

Even more preferably, they contain at least about 95% identical residues or conservative substitutions.

Most preferably, they contain 100% identical residues.
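
The percentage thresholds above can be made concrete with a small calculation. The following is a minimal sketch, assuming two polypeptide sequences that are already aligned and of equal length; the grouping of amino acids treated as conservative substitutions is an illustrative assumption and is not taken from this disclosure.

```python
# Illustrative conservative-substitution groups (an assumption, not from the text).
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identical_or_conservative(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that are identical or differ only
    by a substitution within one of the groups above."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty, pre-aligned, and of equal length")
    hits = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        same_group = any(a in group and b in group for group in CONSERVATIVE_GROUPS)
        if a == b or same_group:
            hits += 1
    return 100.0 * hits / len(seq_a)
```

The result can then be compared against whichever threshold (for example, about 95%) applies to the embodiment in question.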

In determining whether polypeptide sequences are essentially identical, a sequence that
preserves the functionality of the polypeptide with which it is being compared is particularly preferred.

crystallographic Structure. ^ 

10 =^ '■''^'S «o a Brget such as a p<<ypepMe. «,«,gh a teas. ^ 

loca.ed in »e «r*,e re^K^, Of Ih. ,m,«,n=,«H«n ^ Z hZ^T, 

e^sses no. on, i„^, an«^^. ^ ^ ^ ^ ^ 

3n«bod.s. and any «h.r n„difed con«,„„«on o. *e n^TZ 
<=onipnsesanan«genrecognffions,teof««m<,ulrMspecllic«y moteute ma. 

" *n.n. 7 mceo-e ma, Is specScal, hound by an anybody 

.hrough «s anlsen ,eco^ s«.. an^gen may. hu. need no, be chenlcaU, 
"™«=!.»«a.s*.««,p,o*cto,«^„^, Theanngenn^ybep^alen! orttl^ 

^0 *en^, ^ I. an capat* 0, «^ p^^'^ » ar^w^ 

|^.n»,as„^bos.usua,Va™„„«. <^ur-. ™y b. 

^hn^ues Known In ,ne ^ coss^ „ co*^, ^.c«^t.Z^ ^ 

S *"'"''"'^"''''*a'™«'*='IP^»'=ll»"'orhun«no,anima,usewhlchlsusi^ 
-m me ^».«o„ 0, e«*, a spe* ^ ^ .esporCt 

hum«,lo.ce.*,.sy«enWcor.ec«.,. 1^ ^nn,™ response n»y be desL JI^l:! 

~ ^me.,e-,„en,o,a p-*,^c«»on. fcrm,*.n,.„ofa par^sobZrr,^ 
l»°l«Wto«*asalnstapa«ioularcondlllonora*sBnca. "sraubsBnce, orlbf 

^^n. Where m. subs^nc. 0, a s^ ^ „ , 

procedure. Enrichment can be measured on an absolute basis, such as weight per
volume of solution, or it can be measured in relation to a second, potentially interfering substance
present in the source mixture. Increasing enrichments of the embodiments of this invention are
increasingly more preferred: 100-fold enrichment is more preferred, and 1000-fold enrichment is even more
preferred. A substance can also be provided in an isolated state by a process of artificial assembly,
such as by chemical synthesis or recombinant expression.
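
Enrichment measured in relation to a second substance can be expressed as a ratio of ratios. The following is a minimal sketch under that assumption; the function name and argument names are hypothetical, and any consistent unit (for example, weight per volume) may be used for the four quantities.

```python
def fold_enrichment(target_before: float, other_before: float,
                    target_after: float, other_after: float) -> float:
    """Ratio of target to interfering substance after enrichment,
    divided by the same ratio in the source mixture."""
    values = (target_before, other_before, target_after, other_after)
    if min(values) <= 0:
        raise ValueError("all amounts must be positive")
    return (target_after / other_after) / (target_before / other_before)
```

For example, a target that makes up 1 part in 100 of the source mixture and about half of the final preparation has been enriched roughly 100-fold by this measure.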

A polynucleotide used in a reaction, such as a probe used in a hybridization reaction, a primer
used in a PCR, or a polynucleotide present in a pharmaceutical preparation, is referred to as
"specific" or "selective" if it hybridizes or reacts with the intended target more frequently, more rapidly, or with
greater duration than it does with alternative substances. Similarly, an antibody is referred to as
"specific" or "selective" if it binds via at least one antigen recognition site to the intended

o/-spedtan,dai«d„3-as„l^i,,„^„^_^'^^ *■ ai-l^ody » capable 
-»»e,.„^o.^a,^.„.^^lt:^;;7«»"»'P=--'ce».p, 

A "pharmaceutical candidate" or "drug candidate" is a compound

olhar *an by mto» a- ™»»». Tba ™y ba he«»clog«s to cal 0,1^ J2 . 

Pc«.n^.o,Ma «^ an, procaa, known ,„ « 3^ 
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»«. a DNA or RNA v^s 0, v« «ao,. T*. „ ^ ^ 

b/ progeny of »ie altered cell. "™rnaiMe 

A "host cell" is a cell which has been genetically altered, or is capable of being genetically
altered, by administration of an exogenous polynucleotide.

' c-s thJT ""^ " ''^ <^ lingular or plural Ibnn, refer to 

=* that unde-BOn. a n«,na« tmnshnnatton ^, „^ .nen, pathotogica. lo the nost 

7--«-*°«-"*.or...n«.«.con*.„«„,,^.^'„ 
.0 freest, radiation, ^sion o«er .e«. «^ „ «^ „ .»c^ 

^n^«a^n,n..,a„,.™»^™3yo«,t„v^„.^,,«^„^^,^^ 

'*'^'««l"«»P*«09y associated »i», a particuter cancer c«l „o»„I^ 

i™»lbn™*on) can be „«% dfclingoished Ircm non.:ance,cus cells b, well,*stablisl»i , 
P.«=-.,.«..0,.CI...™^. T..**.«.ofacancercell,asZ::„^r^rn:r. 
.^cncerC,^tan,Cd.n«,tan.c^^3^, ^ incudes n-ZH 

«noercefcandin,ltn.cu«u,.,andert.lln..de>l.edlh»„c««rcells 
» V^''"'"*^ '=««<' 'V'c.Kerc-lwltbh.f^, is an^ 

wearing or nonnal physio^ ofthehostTMsn^*^^,^^ tdorn^, „ 
uncontr^a^e growfb o, «e ce.. n^^. cy.,**,.. „ ^ ^ ^^^'^ 

w* the nonnal fondon of neighboring ceUs, aggravation or s„pp««o„ „, .„ mJZT^ 
. -~-^^-7.-"--.of„ndes.«ec.»„^lagr«or:r:^ 

^'«^»'"l"^'«l='acel,isa„y,ype„,^,e,,e„«o„inanattenw,o,terlh. 

*«a..«lMlt»p«l«*»c«..«lby.can«re«l,»^ Tieatment i«*,des 

> 7«*^'««P''*nrWo.«y.««^t,^W^„o..pat,^^^^^^^ 
w* an e^ogic agent ^ ,«d in ^ 

p™ducethedestaleflecl.andn»yb.8l«ni„.i„gte„«„„^ « suflicent to 

*;~™™''==«■*»"='^=«"»«eo<cell..r«,a«»na^,c.|,,,„.^^„.,,^ 
^conipa^n purposes. ^ Wpose of «,e expe*«« .«ab«, . b.,. 

^ '^'-^''^f^r^vs^^,,^^^ ^^^^^^^ 
compared with non-cancerous cells, and which may play a role in supporting the

The term "relative amount" is used where a comparison is made between a test
measurement and a control measurement. Thus, the relative amount of a reagent forming

-•»7^7^(^-*.-.«.*.»«^«^a™,,™„a«a.a:r«^^" 

Thus, for example, "differential expression" occurs when the level of expression of a
particular gene is higher in one cell than another.

"^«'»-^«NA»om<^n,c^.„.«en„^,„,„,3'^'°;^^^ 
5 «n^amon,a..«.«,o«.. OI«^«a,».p^o,RNAIsoond«..,|p,.«;rjn^2^ 

2. CHe.2a13-1, CH13-2a,M. o, CH14-2,16.1 m », l«*M Ibmt « .So «nlwte an, sad, 
pol,naclao«delha.hasbee™o.onador«««fcWlnB.«l|».. an, such 

When ased In refemng lo iho gene screenlnj raMfto* o» Ihl. lm«,lion a, «»» 
«^a. ,n .ne .s, pa,„,ap.,, -Olspla^ oONA- . an, ^ , ^ 

ONA »P»s present in a -ete«.eN greater amoan, In a «na sample compaied «m . 
"n^. a ,.«l«, or .ea^ „nal c«^re. w» ^ J^ne Jl^ 
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du. to ft. dBterence in cop, nwnb.r. Separate o, diferan, cONA h . p,.pa«on 

CONA b««.„ d,»e«n, a p«ter^ „e*od of display is «,e ^ 

l«*nK,u.,.„d.*an«™«,»,,,.u,ond.,.«b« in,. is disctosura and elsewhere 

saparab. b, ,,anda,d «cKn»,u.a, ^ e.«^p^o,«, ^„ , ^^J^* 

endonudaasespecificforapa«lcul.rn«*K„id,s«,«,eeisp,.(eTOI 

"Hybridizing" in this context refers to contacting a first polynucleotide with a second
polynucleotide under conditions that permit the formation of a polynucleotide duplex
whenever one strand of the first polynucleotide has a sequence of sufficient complementarity to
the second polynucleotide. The duplex may be a long-lived one, such as when one
DNA is used as a labeled probe to detect another DNA molecule

b»,nd te a „^^«. ^ „ ,^„, ,„ 3 ^ TPe duplex ™, J J, 

ONA n^aoila. »^ ma an^ pcodpc . subse,^,^ dataced. Tna p,ac«^er n,a,l* e 
ccndlon, o, .ha ,e«=«on te alterthe daste. « co.pten«tad,y „ ,„„, 

»pecllic«y«maln. a d«.m*.n9fectorin the reaction. ■ « lonj as s«,pence 

Untess expllc% l«licat.d or otlH^ise ™<Mrad by the technique, used the steps of a 

ZZl — ~ where' r Z 

appropTtete. ,n one exa„*te. in the method conipn^n, steps a, .h™*, h, that Is das«b«, 
y . » en.,^y approphate .o conduct steps a, te c, o„M ntethod «„r bL or 
e) to 9) C the ^thod, as lon, as the cDNA „«™te, sheeted tt^ c«.re ^^a^ 
and «^ h, ,„ ,n.^ exan^le. s^ni,^ ^3,, ,^Z. l 

..patate*,™, optional, b, do™ at tfte same „n». M pam«a.J 0, rjl. 
wthin the scope of the invention. 

General methods

^'^«'^P'««*»»>to«l,,^.untes,otha««„ 

-«o,.aa. S^tec.„,uesateexp.*ted^.tb.«.,.te,.. S,.,teri:r-Zr: 
A Labotate., ,te„^p. ^ ^^^^^^^^^^^^^^^ ^ 

Synthe^- ,„.,. 3a, ,d.. ,984,, -Anin^l Ce. Culture- ,«.l. PreTn 2 
Mot «!... ,987), -Cu™. Pwtoool. m ««*cular Botogy- (F.M. Ausubel e, al eds ,987,. and 
Features of the cancer gene screening method

The cancer gene screening methods of this invention may be brouaht tn ho« ♦ w 

DNAduplraHon and RNA overabundance rrtalta to me same -n,- , . '° 

-reguentV ^ cancer, me p,esen„, av*.b,e ,e*n^ ^^Z^TZT', 
re^ ln.o,«d in me even,, no, me epec»c ^. Jd T.^^rT' 

P^«« a we, 0, de^cn, ,enes ma. .a, be p«e.n. en e^^r ZTaTrT 
B^.» a^eany pan ofme -nemod invokes de.ec«n9 RNA. m. n,e,^ .v„,de ,ene. m^ 
b. -up^ ,„ an an^on bu. a« ^«scen. (and me,eto,e Inelevan,, .„ 

««te«ableb,metta,niqu«,usedtodesc*.an,p»con.. mo »na« to be 

Near the heart of this approach are several concepts. One is that genes encoding
P|od«s^,^todpos«.e,.men«,n«p™ese.c.^e^^^^:^^^;^' 
. mal^nan, ,«„sto™e«on. m m. contox^ -^e expression- re*„ to expreL « me 
^nscnpt^n ^e.. Mos. .pica,,. «» ^ In .„™ be ^s„tod ^ a ^ »„, ^^^,2 
»z,n».ctand.g.or«,„«^a=^»,^,nincrees,sa^rn»^.,«^ '^'^ 

™xan^,m.RNAn»yencodeo,partlc,pa»aeanbozyme.^s.po,y„„cl^^^ 
ome,^nctona,nuc,e*=ec«n«>to«edunn!,n«,^nanc. m , mm. «<.„,pto. R^„ Session J 
l«.ncrten«bu,.ympto™«cofanlmportan,even,lnn„sft™aUon ""^-^ 
Anome, ooncep. is met ov^expression. » cenire, to mallgnen, .renstonnaton n«y be 
,n diftren, ^„to« b, dl«a™« mech^. and ma, a, leas, one sucn 

«. an^Pton. or dap««ed region of . a.o,„osome. m« mcudee wimin compose Z 

ZT^r. - -V «^ RNA overebuidanc 

yy. soch as by increasin, o,^^ ^ ^ 

l—otor reg^n). by enhancing Iranscrip, „„^ „ ^ „^ ^ 

Thus, the method entails screening at the RNA level. Several cancer cell lines or tumors
and several normal cell lines or samples are compared at the same time. RNA are selected that show a
consistent elevation amongst the cancer cells as compared with normal cells. Additional strategies
may be employed in combination with the RNA screening to improve the success rate of the
method. One such strategy is to use several cancer cell lines that are all known to have duplicated
genes in the same region of a particular chromosome. The RNA

« mo™ llteh, «. rapresa., a da«6.^ o.e™,presa«,„ a.a„t and o»«xp«s,ed ,e« is 

-7'«7'*-=»«""«asco„,™,s,orP,sa.,i„e.«pressio. T.is a.o,.s se^ 0, 
5 ma, ma, ^ axp^ss*. ,av^ ^, as a .asu„ o, «ss„e cu„u*,^ Ano„,e, supp Je^ 
s-ratogy ,s ,0 =on.« ,„ a.d«c™, ,.,0, 0, u^^„, ,,^^^„ 

-^xp™«^ RN^ !.» se^ RNA use. ,0 sc^n ONA „om su^Pie c^ca, ceiis a^' 

z:;::^:'^"*"""°~°'"»"^'*------'-^-a,o, 

" .evajirrrr""'^'^*"'""^'"'^-'-'-- 

The to, part o,„a meM is Pased o„ a sa»«, fcr p««c„* RNA. «a, a« ««,abun*n, 
.n cancer ceite. A w l,„ova«„„ 0, me me»K,d is ,0 compare RNA ,,»««,„ p««e.„ 

« «»9™n.s«»,eme„einagreateram=^,inseve,a,di,emn,cancer,l„es, but^incJlr 
«. more „ r.^ ^ ^, ,„ ^ ™=. 

Ni«un<^se»^„„^^^ «.spad,cu,ar,p,efe„ed«,us.canrcl 
lta,a,.lammtosl>a»aco™«»,dup«caMd»omosonalragion 

^°'"-^*™".o*. ■n».»a^„ reason., or r 
F.« me tssue - pm«e me speomm 0, expresston m« I. ^rpta, « .„ np^,, ^ 
«^.r man individua, d,«^nc.s m« Pe<».„ ^ ^ ^ 

m. e^s ma, . *o ^ ^ ^ „ ^ ^ ' ^ 

^'^•^'»^-9'-m,acBrsma,Paco™up^„«^.^. 

A mM mnov-ion o( mn «|bod is » andarteke a s*salec«p„ fcr cDNA correspondlne B 

«« m* RNA =«„.^„p. , . .„p«^, ^ 0. j'^.nn^: 

rar^'" ""^ •» - a pane, 0, di^^lfc^^ 

cete, and nom«„ano™,c dNA. cONA m« .h<«. *«ence « hi,,., copy „„mb«s Z 

s.ep is ma, cONA oor,esp»«,i„g ^ mioctendd., ,en„ o»„apldt„„ sc,ee.«, «^ pyLwino 
^m«*ond«a,DNAd^asanad<.,^.sa„,.,p,„s^^p^. 

. "»'«»^ecDNA«hichomen™.ma,«„pam,Jori«,o,mecONA,den™ed 
,^'*''*™««l="'>'8»"«>ie«in9p,oducbma,a,,p,«en,a, 
««™l**«<by.meftodcompdsedo,m.,0llo«in8sttps. ""•I'wels.s 
To ld«»liy partta,!,, RNA m« Is o««u„dan, In cancer cells. RNA Is prepared from bom 

o,„c«.ssoo.»d„n.sUC:r 
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metabolism by any one of a number of mechanisms For ey«mni« .k 

The serening ^ «» «». 

« a» ,tep in (he methoa. It Is pa*»*dy corned, ^ ™ , "^^""^ 

=on»olee»popul.llon,witl„>««p^out™RMA. '»™"'™"^^-"™"-=ancera« 

In terms of the cancer cells used as an RNA source

■rt™™™ Of two, mw p^toaM, al teM, m,^ cance, cell. a« u J^r, 

^ the ™i,g„a„cy „, ..e ca™«, Lt^^^ 

S^m. IS pp*.,^ a s„». ^ „, a "^^2 

W* hav, fc„d IN. „ ..ec, ^ ^,„,„„,^ ^ej^^s^ 
no™, ™»MC ^ J^TZr 

' "7=^"-^"«V.b«o««,.,cp„*^^^^„„^^^™^; 
< con*l.rably minimla. M cWef drttoenc, In ». «e pf rna ™„= ■ 

num6=ro,«seHx»l«.es,w.«,«chnl,„«^. 

Shared duplicated reglpn. In canc«- Mis may be mmu by . rehv« 
techn,™. or p, «^„c« . ana^sls a-^ad, c««u«.. ^ ^ 
has been h,h, e«s*e . n«p^, app.x^ subch™n,so™« ^TZZ 
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DNAo„meehro™son«MW0 93/ieia6; s,a„,a,.). The greater *e signal oh«„ 
the greater the copy number of the sequences in the cell. Thus, regions showing
elevated staining correspond to genes duplicated in the cancer cell, while regions showing
diminished staining correspond to genes deleted in the cancer cell. Related techniques, which
5 in .he «, »l, e. w« aware are „e»»ds for preparing aM us^g repja, sj!^ t. 

chron,o,o,,«>ped«c nudeic acid pn*es (US 5.427.932; Weier e. aL). me»,cds to sBlning .a„« 
Chror^soma, 0^« «^ labetod „««c .o« »„,,^« m coniunCon ^ „oc.„g 
con,pte™n.«y ,0 rep,,^ DNA ,US 5,447,84,; Gra, e, al.,, and .emods I JJl 

10 (US 5.472.842; s.okke e. al). If desired. n«1»ple Ikiorodtrome. be «ed „ labelino Lni. 
203H and re,a.ed »cn„.„es. «. p..^. a .re.^ ^ detaLd^l?:: 

duplicated chromosome abnormarifes (Lucas e.al.). "nai. ana 
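
The interpretation of staining intensity described above can be sketched as a ratio test per chromosomal region. The following is a minimal sketch only: the thresholds `gain` and `loss`, the function name, and the region labels used in any call are illustrative assumptions and are not values taken from this disclosure or from the cited references.

```python
def classify_regions(tumor_signal: dict, normal_signal: dict,
                     gain: float = 1.25, loss: float = 0.75) -> dict:
    """Label each region by the ratio of tumor to normal staining signal.
    Thresholds are hypothetical and would need empirical calibration."""
    calls = {}
    for region, tumor in tumor_signal.items():
        ratio = tumor / normal_signal[region]
        if ratio >= gain:
            calls[region] = "duplicated"
        elif ratio <= loss:
            calls[region] = "deleted"
        else:
            calls[region] = "unchanged"
    return calls
```

Regions called "duplicated" in several cancer cell lines would be candidates for the shared duplicated regions discussed in this section.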

The choice Of a pa,«cular chromosomal mapping app,o«* I, l„ei«,aht, e««lall, once 
X^oi^ge o, .he dup,^ .eglon is Known. If .he ^cation o, .he chn^som. IpJi^" 

-e alrea^^, e«ahl^hed br a c^ «„e .0 he used RNA compa.son du*g .he course 0, 1 prl" 

InT T" •^"'"^ avalla^e in .e pu* Zt' 

Proved ,n .he ,««o„ of W. apptea^ I. a lis. of o^r 40 ancles in which to 

foc«.«» 0, dupiicled „^ p.«^,„ ca^e, c.^ .„ ,„ ^ ^ 

lT?r'T''''"''''°'*^"*'''^*'"»"^»^^"-°"-*date 
so ^a. .he, share a duplicated chh».««« ^ ^,„^ ^.^ ^ , 
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The ca™=er cells used .or RNA comparison are also genera., (bu, n« ^ 
cancer or «te same «ssue. Using cells deri«d from Ihe s»n. Wc»«r 
~ «. P«««blll., lha. 0,e gene u«imate» idenliSed wK be common in Iha. nTpTof ^ 
."d ^ a. . di^nos^ m=«er Us^g OeHved »om d^ ^ 

eflecta .«ch fc,c-^^«.ge.,.s^,.,„ ^^^^ „^ ^^^^ 

^ h-,,-™. P««» ih .«»B,. B«h w« Of ,««.,.., Uteres, for bCh diagnosii c a^ 

mr^reas, cancer C llna. eT474, SKSR3, „CF7. which d«..n*ed b, C<,Z 

Sou^em analysis «, share a dup«ca«, ,.™Kic ragton. In eh™™«,™. „, ,7 »«, 20 

rr r ""T . 0, .NA war. fbuh. . be ^1 

cancer cells, bu, no, consols (Fi^ ,, Th« rna o«™bu„dan, In m„^^^ 

listed in Table 1. The chromosome 13 gene (CH13-2a12-1) was overexpressed in 2 of the 3 cell
lines: BT474 and SKBR3. Southern analysis subsequently established that the
chromosome 13 gene was duplicated in the same two cell lines (Example 6, Table 5).
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Selection of the source or sources of control cell RNA is also » maHo, * 

canoe, o^. a. ^s. a. .n. ,e«, „ rna abundance. Hence, con^nson o, ,n, RNAlr,n 
canoerce*»*s.«„co„«^^„,^^^^^^_^^ ~ 

expenn»„l an. e«„ e^.«; ^ 

in . °" """^'^ Exan^ 1, RNA wa, seined »,„,t^ 

P«»«e,a,ng and noni,rol(fc,a«„g cm. As s«ed, s«een»g of b,ea« oanoe, rnT! 
P«.e«., cond^ „„c„.„^ ^, ^ (^nned^ol^ 

3;^«^^-=--™.-c«ained,,o..,,^„^„..^'L*'» 

Th, RNA I. p™«™«, « ^ i„ »e companson expe«„«n, ,„ such a way u, minln«e 

20 0^«.„. PC, ..^ ^.on. , . co™-«„, to u.. RNA ^ ^ ^„ „«3,n« 1 sit 
™ C .n«or «^ ^ ^ ^ 

P«pam.genou9hRNA«>ltot«canbope.,an«dl„aliquol.. 

For displaying relative overabundance of RNA in the cancer cells, compared with
control cells, many standard techniques are suitable. These would include any form of subtractive
hybridization or comparative analysis. Preferred are techniques in which more than two RNA
sources are compared at the same time, such as various types of arbitrarily primed PCR
fingerprinting techniques (Welsh et al., Yoshikawa et al.). Particularly preferred are differential
mRNA display methods and variations thereof, in

ITT.°^ Tl»s. tectal,^ „ ^ ^ ^ ^ ^ ^ 

to the poly-A tail characteristic of mRNA (Liang et al., 1992a; U.S. Patent 5,262,311).

one tme. » „ p.e,»abte to »^ „e ,e*l*», Of th. dbpla, .u„«,^, on,, 3 
RNA at a time. Methods for accomplishing this are known in the art. A preferred method is
by using selective primers that initiate PCR replication for a subset of the RNA. Thus, the RNA is
reverse transcribed by standard techniques. Short primers are used for the selection, preferably
chosen such that alternative primers used in a series of like assays can complete a comprehensive
survey of the mRNA.

In a preferred example, primers can be used for the 3' region of the mRNAs which have an
oligo-dT sequence, followed by two other nucleotides (TiNM, where i ≈ 11, N ∈ {A,C,G}, and M ∈
{A,C,G,T}). Thus, 12 possible primers are required to complete the survey. A random or arbitrary
primer of minimal length can then be used for replication towards what corresponds in the
sequence to the 5' region of the mRNA. The optimal [...]
nucleotides. The product of the PCR reaction is labeled with a radioisotope such as 35S. The
labeled cDNA is then separated by molecular weight, such as on a polyacrylamide sequencing gel.
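
The count of 12 primers follows from the three choices for N and the four choices for M. The following is a minimal sketch that enumerates the set, assuming an oligo-dT run of exactly 11 T residues (the text gives the length only approximately); the function name is hypothetical.

```python
from itertools import product


def anchored_primers(t_length: int = 11) -> list[str]:
    """Enumerate oligo-dT primers of the form T(i)NM, written 5' to 3'."""
    n_choices = "ACG"   # N: first base after the T run; T is excluded
    m_choices = "ACGT"  # M: second base; any nucleotide
    return ["T" * t_length + n + m for n, m in product(n_choices, m_choices)]
```

Each of the 12 primers selects a different subset of the mRNA, so that a series of like assays using all of them surveys the whole population.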
If desired, variations on the differential display technique may be employed. For example,
one-base oligo-dT primers may be used (Liang et al., 1993 & 1994), although this is generally less
preferred because the display pattern is correspondingly more complex. Selection of primers may
be optimized mathematically depending on the number of RNA species in a tissue of interest
» ^ue, et al, The method may be ad^ fp, n,^,^ ge.. and tor ^ ^ 
C^Asequencers,^,^.,^.,, Mern,,^^^^^^ 

e. a , may be used tor labeling »,e differenbai display. DUfere,*., d^play may optiooa.^ Z 

.ncorporate a restnct™, s«e to facHitate cton^ (Linskens et al.. Ayata e. al, uL Z 
1S po^meras. ,*om mult^le mahufacture. can increase me amount o, ^ u der Zv^I 
2^ coh^bon. (Haag et Nested PCR phmers may be used in d»tore„« d.pZt 
dtZirr . "e"-" ^ o^-dT ,».™,s two 9«3760,. Other .ahents of the 
diffe^^a, d^play are toH«n ,» the a« and deschbed »«er a»e in the retorences cited . 

20 Z^°' ™*;"°""*™*'^=-*'--«°'«'eP-entin»n.ton.bu,a.e 
not required, as evidenced by the examples described below.

Based on the comparison of relative abundance of RNA, particular RNA are chosen which
are present as a higher proportion of the RNA in cancer cells, compared with control cells.

2lTh : T*' •» ■"'•^ "NA Will 

with the proportional intensity in the control lanes. Desired cDNAs can be recovered most directly
by cutting the spot in the gel corresponding to the band, and recovering the DNA therefrom.
Recovered cDNA can be replicated again for further use by any technique or combination of
techniques known in the art, including PCR and cloning into a suitable carrier.
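
The selection of bands that are proportionally more intense in cancer lanes than in control lanes can be sketched as follows. This is a minimal sketch only: the fold threshold, the minimum number of cancer lanes, the function name, and the data layout (band label mapped to per-lane intensity) are all illustrative assumptions rather than parameters stated in this disclosure.

```python
def overabundant_bands(intensities: dict, cancer_lanes: list, control_lanes: list,
                       min_fold: float = 2.0, min_cancer_lanes: int = 2) -> list:
    """Return bands whose intensity exceeds min_fold times the strongest
    control lane in at least min_cancer_lanes of the cancer lanes."""
    selected = []
    for band, by_lane in intensities.items():
        control_max = max(by_lane[lane] for lane in control_lanes)
        elevated = [lane for lane in cancer_lanes if by_lane[lane] > min_fold * control_max]
        if len(elevated) >= min_cancer_lanes:
            selected.append(band)
    return selected
```

Requiring elevation in more than one cancer lane reflects the strategy of comparing several cancer cell lines at once, so that differences peculiar to a single line are not selected.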

An opttonal but highly benelicial addttonal aoeening step typlcalh, o«<o^ 

:r"* Th. . conduced by usLg'l. 

^hce^s oe.,. suoh a, cancer ce« ^s. Ch,»«s«™, DNA »dm non.^, cete t,^ 
«-<«y««.otsmegennlin.intem»of,e„ec.py™^i.^^^^ 

obtanaNe. The ONA samples »eclea«d at sequence^pecilicstes along m.ch,omo«>n» mo« 
«r«able re«,^ enzyme into ^agments of appmpr^to Le. iZToanT 
«ed di^c, ^ 3 ^ ^ „ ^ ^ ^ ^ bTJng t^ 

prefened, bec^ee . enabto. e rx^son 0, ma hybhd^g ch^mos-x.,, «^ 
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5 .*m„^,^ one ™.ho<, Is ,o adm^isle, , second ^ ^ ^ ^ Pco«ng fcr , ,3 
„ne „„*e,y » ^ d*,ca«d ,n ,Ha cancer ce,,s. TH. , p.led 

no. onv to, dlSe^nces . ... an«.„, c DNA pro.«ed. a J tor dl^ 

To eltainate cONA ftr mtoctoWrial gen«,. , is p^ferabl. » ,„ , 

KyOnd.es ,„ .He app^prtate n«ocHondHa. ™sB»«on »a,n»,s c«, ^ s.^ 

corresponding to a mitochondrial gene. "pecieo or 

RNA s^^ance. TN», cop^ ^ segn^n, „.y o. . ^« o. 

by „ ^ ■ 

1 ^r*"^ ""^"^ RN^ 0, to iso^ c»npte^,a^ DNA a 1a 

.bmn- of jb. s.™, a^ «» !««-, is denvad ftom ,be san» 6ss„e so„n:e and 

md« p™*™«y cncr c«l f„ 

l».m«, 0«a„ , is d.d«d (h»n b™« cc ce» ,«e BT4r4 

constructed in lambda GT1 0. * 

samn.. T"""' - submitting the 

sample to commercial sequencing services. The chromosomal locations of the genes can be
determined by any one of several methods known in the art, such as in situ hybridization using
chromosomal smears, or panels of somatic cell hybrids of known chromosomal composition.

The cDNA obtained through the selection process outlined can then be tested against a
larger panel of cancer cell lines and/or fresh tumor cells to determine what proportion of the cells
have duplicated the gene. This can be accomplished by using the cDNA as a probe for
chromosomal DNA digests, as described earlier. As illustrated in the Example section, a preferred
method for conducting this determination is Southern analysis.

The cDNA can also be used to determine what proportion of the cells have RNA
overabundance. This can be accomplished by standard techniques, such as slot blots or blots of
agarose gels, using whole RNA or messenger RNA from each of the cells in the panel. The blots
are then probed with the cDNA using standard techniques. It is preferable to provide an internal
loading and blotting control for this analysis. A preferred method is to re-probe the same blot for
transcripts of a gene likely to be present in about the same level in all cells of the same type, such
as the gene for a cytoskeletal protein. Thus, a preferred second probe is the cDNA for beta-actin.
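
The use of an internal loading control can be sketched as a normalization step. The following is a minimal sketch, assuming that band or slot intensities have already been quantified in arbitrary units; the function and argument names are hypothetical.

```python
def relative_abundance(cancer_target: float, cancer_control_gene: float,
                       normal_target: float, normal_control_gene: float) -> float:
    """Target signal normalized to the loading-control signal (for example,
    beta-actin) in the cancer sample, relative to the same normalized
    signal in the control sample."""
    values = (cancer_target, cancer_control_gene, normal_target, normal_control_gene)
    if min(values) <= 0:
        raise ValueError("all signals must be positive")
    cancer_normalized = cancer_target / cancer_control_gene
    normal_normalized = normal_target / normal_control_gene
    return cancer_normalized / normal_normalized
```

A value well above 1 would suggest RNA overabundance in the cancer sample after differences in loading and blotting have been taken into account.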
Using a novel cDNA found by this selection procedure, it is anticipated that
cancer cells showing gene duplication will also show RNA overabundance, although some cells will show
RNA overabundance without gene duplication.

5 ^irrZilr^^^^^^ ~ - are 

.enes that are deleted arrsi: T V^^^" 
essen«a„ the sa^. cenes that a. .e,uen«y down-regulat J H^nce " ^^^^^^^^^^ 
suppresser genes, n.y ^ down-regulated by different .echanis^s in diCt l anTa 

p—Tp::^^^^^^^^^^^^ -on. .A is 

Preparat^n^orncontrolcells. Again, itishigh^pre.::^^^^^^^^^ 
gene in the same chromosomal region in order to focu. h «» 
15 particular alterations in cancer cells an. 1 ! ' "'''^ 

a reas, ^ ,p^fe,3bl, more) of *. ^1 . in 

^l«.s,caacarce,.a„.co„«,ce,,. Tl^a^ntJ^eTL^ 

•*-*™ or P««. 0, ,0 ™,cteo^ we^Zt, ? 

^»pa«».^^«,„o„.,^,^^"^'^ P™*« was 
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0 Further description of the actual exDerimentei *k * 

^0. exe.p.a. .enes. an. sequence .ataTZlT^ ch^^^^^^ °' 
2a16.l areprovidedintheExarrtptesectlon. CH14- 

Preparation of polynucleotides, polypeptides and antibodies

Polynucleotides based on the cDNA of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1

3.ra« c *e to.^^ cDNA can te ope«», to a ZCo^TZt^lrT 

~to=a..an.,.po..p«e,3a^„^™":o"™r:" 
determine the polynucleotide sequence of the cDNA anrt nr^^ «, . «"veniem method is to 

Antbodte against poM»p^ 
»»a«. --"XK^r p^duc*,, h an an,™,. , I. oten 

combing w«, an a<.lu«„,, sucn a. F^nf. ^. ^ ^ 

«^an™„^aaa««„a,«ptorp«pa^„„„p.^^. ..^H^: 

Immunized animals provide a source of polyclonal antibodies.
Detailed procedures for purifying specific antibody activity are known in the
art. Unwanted activity cross-reacting with other antigens, if present, can be removed by
running the preparation over adsorbants made of those antigens attached to a solid phase and
collecting the unbound fraction. If desired, the specific antibody activity can be further purified by such
techniques as protein A chromatography, ammonium sulfate precipitation, ion exchange
chromatography, high-performance liquid chromatography and immunoaffinity chromatography on a
column of the immunizing polypeptide coupled to a solid support.

Alternatively, immune cells such as splenocytes can be recovered from the immunized
animals and used to prepare a monoclonal antibody-producing cell line. See, for example, Harlow &

« *„« and ou«u«d. a™, ao«, a« »tec,ed «a, p,o*«. artibod, o, *e sped** 
spec, ca. pe*™ed on a^. supernatant by, n^ « .ec«J^ 

ia -^'^—sp^ypeptd.as^eda^c.n^.a^n.^as^dard^n^ayr^^ 

«^*».,c,„p.^»..„a,.,^,^„,«s„acu«,™sope„^„t„^l.^ 
««lof«iila«yBreparBdhMtartfmabl*cWwlft«»dTO^ <»"»>easMes 

^n»««so,s.„da,dp™,*c.«s^.a^„„^^ ^ 

p™«»*«c.nz,™. '3".««^ena™e«d™«an«of«,.«»^canb.p«,^byl,„»,a 

«.~«a .needing antbod,. ^ app^ ^ ^ ^ 

"Wmduce mutations and translate «» variant » """s, to 

Use in diagnosis 
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'*»^'*'"«l'»««""«N»nd^l09«».assodatadw» 

»d»^«d. SO«**P=VWi,te.«^b,su«,.„,s,a„d^Ood.,;«,fcb'C 
polypeplKl68.ar.alaopot«i«a«y,Ba(Ua.dlagnosllcaids. w cnrinese 

*k™sp««c-V.,«»d^«=«k„ 

levels of RNA corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1
P-=«^a«.^n.,a,p,cpo*no,.^s.ca^^ 

cancers other than breast cancer, including colon cancer, lung cancer,
prostate cancer, glioma, and ovarian cancer.
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For pafe« tineady aia^M »» cance,. g«» clup«ca«on or omabonto^ of RNA 
ass. cn^, „,„,,en^ ,,^„„3^ 0,^11; 
"S«L. pTMPctor « d«e,se su*al. mslasteis. .us=ep«H»y to ^ r^^ITlT f 
«e„»««,3„..es«,oo,^ca„cer.o„.3,„es.vJs sl^ZT-Z^Irr 
««o*a^ar.^„.„^^,^,„^„,_,^^^J^J-^^^^^ 

<l««d at a partate, 5.„e. a dla9,,«ic test specific fc, «„ ,a,« 

patients ili<el, to l»neflt from ll«plM,maceutlcai Gi«»ia5el«*,no,!.^ ' 

err— ■•"•-"■^•^===: 

a« ex.e™»»V known ^ «e ait, an. ar, ™^ to a p«^ Of ord^a,^ 
see sample. U.S. ^ ^. 4.«e.«» (S,^ et an. PCT Appiica^. v» 9^, 

.0 ^'^'"-'"^""""'^^e.al, ^^.apnefnon^im^ZT- 
20 some of llielommprooeduias that can baappliad. "'•syoi 

'^'^'^•"«*^»-«a9no«i=me..xx.«,«,i„««,pn.oneol,,»c^ 
«««»n«prov»,das,,«^.o.«ec.a^^ac.nica,.an*w,*«,^.^^ ^ 

po^-«e««e of this in«n,ion can be use. as a .easem » a D.*^ or RNA ..^ Z, » 

^ motatie or t,» po,p.p»* , , «» oorrespon*, Jan*„„ o^ 

be a. . reagent to detect a ,a,B« i. spsdiV^ ,«^«z«t, ««H as the pol^peptid. >«« 
"TOMiosenloraiselt. •"'I'v^'osaxm 

'*''^'»'*»"'»'«-«n9a«*bleti.s,«,ampl.ln«,anW^^ 

^agno.^ pa-arneter . ,0 b, i^asur... Ra,e«« test ean*. „ ^ 
»^o^con.**,,ca„ce™„Pa^^^_,^ J^^^ 

»^te«^fcrth,sp«ppse,«<^*os.e««ep«a^^^3^^^^ 
suiSK^ai dissec^on. in *, cituree o, cdis de.,«, .berefton. btocd. and btood oompomi tf 
desire. ^ ta,^ may be padfled (torn tl» aan*. o, anpKW be*» tb, assay is 

^..ucts.. The..ac«onispe-fb,m.dbycontactir,the,e.g.m»|thtbesample,ndercondl6^ 
«atewaccmpto,„tomb««en,t,e,ea9entar«,t.«,^«. T>» ,.«>tion ,™y b. peHortned h 
«««»,orona»««tissuesa,np.e,IOr««nw..osin9.*to,o8y»K^^^ Tbe fcmtata, of the 

comptotsdehcadbyanumberoflechniquesknowolntbeat For example, the reag.« may be 
s^*.,*a,.bela,«™,.a^,^^^^^ JJ^^ 
remaining label thereby indicating the amount of complex formed. Further details and alternatives for
complex detection are provided in the descriptions that follow.

To determine whether the amount of complex formed is representative of cancerous or non-cancerous
cells, the assay result is compared with a similar assay conducted on a control sample. It
is generally preferable to use a control sample which is from a non-cancerous source, and otherwise
similar in composition to the clinical sample being tested. However, any control sample may be
suitable provided the relative amount of target in the control is known or can be used for comparative
purposes. Where the assay is being conducted on tissue sections, suitable control cells with normal
histopathology may surround the cancerous cells being tested. It is often preferable to conduct the
assay on the test sample and the control sample simultaneously. However, if the amount of complex
formed is quantifiable and sufficiently consistent, it is acceptable to assay the test sample and control
sample on different days or in different laboratories.

A polynucleotide embodied in this invention can be used as a reagent for determining gene
duplication or RNA overabundance that may be present in a clinical sample.

^^on between a reg»n of the po^nucleotide reagent and the DNA or RNA in a sample being 
If desired, the nucleic acid may be extracted from the sample, and may also be partially
purified. To measure gene duplication, the preparation is preferably enriched for chromosomal DNA;
to measure RNA overabundance, the preparation is preferably enriched for RNA. The target
polynucleotide can be optionally subjected to any combination of additional treatments, including
digestion with restriction endonucleases, size separation (for example, by electrophoresis in agarose
or polyacrylamide), and affixing to a reaction matrix, such as a blotting material.

Hybridization is allowed to occur by mixing the reagent polynucleotide with a sample
suspected of containing a target polynucleotide under appropriate reaction conditions. This may be
followed by washing or separation to remove unreacted reagent. Generally, both the target
polynucleotide and the reagent must be at least partly equilibrated into the single-stranded form in
order for complementary sequences to hybridize efficiently. Thus, it may be useful (particularly in
tests for DNA) to prepare the sample by standard denaturation techniques known in the art.

The minimum complementarity between the reagent sequence and the target sequence for a
complex to form depends on the conditions under which the complex-forming reaction is allowed to
occur. Such conditions include temperature, ionic strength, time of incubation, the presence of
additional solutes in the reaction mixture such as formamide, and washing procedure. Higher
stringency conditions are those under which higher minimum complementarity is required for stable
hybridization to occur. It is generally preferable in diagnostic applications to increase the specificity of
the reaction, minimizing cross-reactivity of the reagent polynucleotide with alternative undesired
hybridization sites in the sample. Thus, it is preferable to conduct the reaction under conditions of
high stringency: for example, in the presence of high temperature, low salt, formamide, a combination
of these, or followed by a low-salt wash.
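
The effect of salt and formamide on stringency can be illustrated with a melting-temperature estimate. The following is a minimal sketch that uses a commonly cited empirical approximation for long DNA duplexes; the formula and its coefficients are not taken from this disclosure, are only approximate, and the function name is hypothetical.

```python
import math


def approx_tm_celsius(length: int, gc_percent: float, sodium_molar: float,
                      formamide_percent: float = 0.0) -> float:
    """Rough melting temperature of a long DNA duplex:
    81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length."""
    if length <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium concentration must be positive")
    return (81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent
            - 0.61 * formamide_percent - 500.0 / length)
```

Lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a fixed incubation or wash temperature sits closer to it and less well matched duplexes are destabilized. This is one way to see why those conditions are described as more stringent.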
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in owe, to detect completes foored between the reagen, ana Ite terse. tl>e««n.» 
^ ^ e ^. S^.^^^^ ,„ ^ C 

- P^o^'^'^se the, a™ capa.te o, p„.„.,„g a cCo^d or p^dTT 

connect- »«^h a »ne. « •,tenr««e ,e«^ ™,ec„.s, eucn as a t,to«n^, „^ro,^ 
0. ,n»,^ povtt^. ™, ^ , ^, ^ ^ 

»*thetaigrtpolynucleoli(lB.oralte<waid8. 

/°''^'^"™'»««''''^««*.««otend«rtbte,o increase tlte signal ensaino 
reagent po^™ce««.. s«* ., b, a po^n^ ^ ^ „„„.,„^,^ ^ 
sena.y .ybnd«ng poVn«^ „ bra^^ed ^n„c*o.«.. c^^^l^, " J 
n«^e^^.con^Pec»„»^^^^ S.. U.S. P..^ JsZ." 

*"™«'»*«"l»«>«-l"«sln»ntioncanalsobe„sedasanasi««l„«ncerdl.S^ or 
for determining gene duplication or RNA overabundance that may be present in a clinical sample.

p™d«,m 0, tb. co,«pond»>s pcypeptue. Se«,al o, tbe genes „p^„,ated ,„ cancer oe*, 

by the cell into the surrounding milieu.

Any such protein product can be detected in solid tissue samples and cultured cells by
immunohistological techniques that will be obvious to a practitioner of ordinary skill. Generally, the

Zt sTr« ' """""""" °' ^"'"''"^ "^^^ ex^anging into 

2r 1 T "^'^ " Parafo^ldehyde. or embedd^g in a conu^Lal, 

available medium such as paraffin or OCT. A section of the sample is overlaid
with a primary antibody specific for the protein.

The primary antibody may be provided directly with a suitable label. More frequently, the
primary antibody is detected using one of a number of developing reagents which are easily produced
or available commercially. Typically, these developing reagents
bear labels which include, but are not limited to: fluorescent markers such as
fluorescein, enzymes such as peroxidase that are capable of precipitating a suitable chemical
compound, electron dense markers such as colloidal gold, or radioisotopes such as 125I. The section
is then visualized using an appropriate microscopic technique, and the level of labeling is compared
between the suspected cancer cell and a control cell, such as cells surrounding the tumor area or
those taken from an alternative site.

The protein corresponding to the cancer-associated gene may be detected in a
standard quantitative immunoassay. If the protein is secreted or shed from the cell in any appreciable
amount, it may be detectable in plasma or serum samples. Alternatively, the target protein may be
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aflUMto.sol«lpl««,sucl,«6,abtottech„i,„e»usl„gaoapft,,eantibod,. ■ 

for «a,„pte, «» p™«„ „», b. mi»d w* a p«^ete™»,ed „o«™fcg a J„, 0,11^^; 

s ^•■»^'"'^-avco„.,„a.«:aZ:::^:r 

an enzym. „ . ..^^ c . ^ ^ ^ a.ded ,1- 

'0 pos^vav . ^ a™« „, ^^^^^^ s^A .alT^l: 

...rr-.rLTrrrtci'zrar:--^' 

P0>,pe^. Thus, a „u™,»e, o, ^ ^,«o^ ^.^1^^ " 

human IgG antibody molecules present in a semm 

detecting reagent such as lat«led anlHmmunoglotHilln The amc^nt of 
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rTnTlT"* """'"'"^ <"ttec«.po«nt ocan .te«on in 

«» ton,, Of *e coupon,™. comp^neC ««, ,ha, ,„ a sampte torn a ^ea«hy t,dMa.al. The chtol 
«mpl. s optio^lV prrtealed fc, endchmen. of b^, being „s,e<. The user „pfes a 

con..in«. in «, in on*, » detec. ^ cn,ns«, o, a«=,a,ion ,n Jn^ 
component y"W5>uc 

Each kit necessarily comprises the reagent which renders the procedure specific: a reagent
polynucleotide, used for detecting target DNA or RNA; a reagent antibody, used for detecting target
protein; or a reagent polypeptide, used for detecting target antibody that may be present in a sample
to be analyzed. The reagent is supplied in a solid form or liquid buffer that is suitable for inventory
storage, and later for exchange or addition into the reaction medium when the test is performed.
Suitable packaging is provided. The kit may optionally provide additional components that are useful
in the procedure. These optional components include buffers, capture reagents, and developing reagents.

Use in pharmaceutical development

Embodied in this invention are modes of treating subjects bearing cancer cells that have
overabundance of the particular RNA described. The strategy used to obtain the cDNAs provided in
this invention was deliberately focused on genes that achieve RNA overabundance by gene
duplication in some cells, and by alternative mechanisms in other cells. These alternative
mechanisms may include, for example, translocation or enhancement of transcription enhancing
elements near the coding region of the gene, deletion of repressor binding sites, or altered production
of gene regulators. Such mechanisms would result in more RNA being transcribed from the same
gene. Alternatively, the same amount of RNA may be transcribed, but may persist longer in the cell,
resulting in greater abundance. This could occur, for example, by reduction in the level of ribozymes
or protein enzymes that degrade RNA, or in the modification of the RNA to render it more resistant to
such enzymes or spontaneous degradation.

Thus, different cells make use of at least two different mechanisms to achieve a single result:
the overabundance of a particular RNA. This suggests that RNA overabundance of these genes is
central to the cancer process in the affected cells. Interfering with the specific gene or gene product
would consequently modify the cancer process. It is an objective of this invention to provide
pharmaceutical compositions that enable therapy of this kind.

One way this invention achieves this objective is through screening candidate drugs. The
general screening strategy is to apply the candidate to a manifestation of a gene associated with
cancer, and then determine whether the effect is beneficial and specific. For example, a composition
that interferes with a polynucleotide or polypeptide corresponding to any of the novel cancer-associated
genes described herein has the potential to block the associated pathology when administered to a
tumor of the appropriate phenotype. It is not necessary that the mechanism of interference be known;
only that the interference be preferential for cancerous cells (or cells near the cancer site) but not
other cells.

A prefe^l method of scr«nl„9 is lo pravMe ells In which a oolyrudeoMe retelM to a 
P|epa.,^ "-^ v«c. soc as a ^ vec^, ^ ^ ^ 

*;=^'™''='««'"'*''*'«>P'»™*»<^bl.iotes«n3,a™,„h^ 
10 w,.,„cu„... The calM^ . ^ 3 po^uceo^* c<»«s^^ „ 

15 «'^«'^O"iytha.tha,™fec»0h«su«inasubsBn,«i,,c«asei„,he,e«^ 

^^-^^^ . — ^ ^ sa. po„., 3.^^ , ^ : 

'''^«««*8*P«*m«byad<«n9a«*cahdldalaloas^ 
h»^,^ .he e^ect ^h. 3 p«a« san^ ^ „ ^ 

It ? » .hen =0^ ^ an, s„«3t^ 

^ e™„, *e ^, 0, a pa,ta«, RNA or po,p.p«a assoc^ 

«h 0*, can, „ co„,po„,xis. Memncas »e^«.n ««e<. and .h.™a« i^ 

5 «^»'^«-««..l"ap,fe„ed™,«,,^a*c.o,^dr^o„«e=e«,«s^ 
^ po^n„c,ao«deiaa,s.co»,pa«»,.h,hee^onac^«^^ S„«aNe c»»o, ca^^ 

^«^Jh,.a™po,™«ao«,.„,^^ Op*™^. Jdh^haaa J^ 
effect on opwawimansfielad can. than on 001*01 cells. 

' '^''^°'**'^»"™«***^i««toaoa«ee.onanyphenolwl^ 

^ t.ns^ o, ce. 1^ «h ^ po^^ ^ ^ ^LZ'^^j:^ 
^tco^l,«ap«„^,^,,^^3,„3^^ »^ 

e«««p™eu,,o,»«,sthe..„=«on.l«*c,o,thep™t«n. The affect o,th.d,„, would .»app^I^ 
^co„«.n,«,„«.Oa,w.an«a,.da„dun.,ea««s. An»«„,.o,thes.^,tZr 
Use in treatment

This invention also provides gene-specific pharmaceuticals in which each of the
polynucleotides, polypeptides, and antibodies embodied herein is a specific active ingredient in
pharmaceutical compositions. Such compositions may decrease the pathology of cancer cells on
their own, or render the cancer cells more susceptible to treatment by non-specific agents, such
as classical chemotherapy or radiation.

An example of how polynucleotides embodied in this invention can be effectively used in
treatment is gene therapy. See, for example, Morgan et al., Culver et al., and U.S. Patent No.
5,399,346 (French et al.). The general principle is to introduce the polynucleotide into a cancer cell in
a patient, and allow it to interfere with the expression of the corresponding gene, such as by
complexing with the gene itself or with the RNA transcribed from the gene. Entry into the cell is
facilitated by suitable techniques known in the art, such as providing the polynucleotide in the form of a
suitable vector, or encapsulation of the polynucleotide in a liposome. The polynucleotide may be
provided to the cancer site by an antigen-specific homing mechanism, or by direct injection.

A preferred mode of gene therapy is to provide the polynucleotide in such a way that it will
replicate inside the cell, enhancing and prolonging the interference effect. Thus, the polynucleotide is
operably linked to a suitable promoter, such as the natural promoter of the corresponding gene, a
heterologous promoter that is intrinsically active in cancer cells, or a heterologous promoter that can
be induced by a suitable agent. Preferably, the construct is designed so that the polynucleotide
sequence operably linked to the promoter is complementary to the sequence of the corresponding
gene. Thus, once integrated into the cellular genome, the transcript of the administered
polynucleotide will be complementary to the transcript of the gene, and capable of hybridizing with it.
This approach is known as anti-sense therapy. See, for example, Culver et al. and Roth.
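
The complementarity requirement described above can be illustrated with a minimal sketch. The function name and the example fragment are hypothetical and are not taken from any sequence in this application; the sketch only shows that an anti-sense transcript is the reverse complement of the sense transcript.

```python
# Minimal sketch (illustrative only): derive the anti-sense RNA for a sense RNA fragment.
# The fragment used below is a made-up example, not a sequence from this application.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_rna: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    return sense_rna.upper().translate(_COMPLEMENT)[::-1]


example_sense = "AUGGCC"          # hypothetical fragment
example_antisense = antisense_rna(example_sense)
print(example_antisense)
```

Applying the function twice returns the original fragment, which is a quick check that the two strands are mutually complementary.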

The use of antibodies embodied in this invention in the treatment of cancer partly relies on the
fact that genes that show RNA overabundance in cancer frequently encode cell-surface proteins.
Location of these proteins at the cell surface may correspond to an important biological function of the
cancer cell, such as their interaction with other cells, the modulation of other cell-surface proteins, or
triggering by an incoming cytokine.

These mechanisms suggest a variety of ways in which a specific antibody may be effective in
decreasing the pathology of a cancer cell. For example, if the gene encodes for a growth receptor,
then an antibody that blocks the ligand binding site or causes endocytosis of the receptor would
decrease the ability of the receptor to provide its signal to the cell. It is unnecessary to have
knowledge of the mechanism beforehand; the effectiveness of a particular antibody can be predicted
empirically by testing with cultured cancer cells expressing the corresponding protein. Monoclonal
antibodies may be more effective in this form of cancer therapy if several different clones directed at
different determinants of the same cancer-associated gene product are used in combination: see PCT
application WO 94/00136 (Kasprzyk et al.). Such antibody treatment may directly decrease the
pathology of the cancer cells, or render them more susceptible to non-specific agents such
as platinum (Lippman).

ofe/*cto,c«npo„enB. l^eP™«np™,«aofB,.ca«er.«s«iated!,e.«is»xp,c,e.,oappearin 
l«3h ,«,„e«, „„ cn«r CI, con^ » ^ ^ <»e«b»dance of «,e 

cor.sp„nc.,n9RNA.Th.p™««,*.r«o™p™v««a™rk.rfbrca.c»ce,,s«,,a.p^^^ 

can t,„a ,0. A„ ,««o, compo™« «Bcted to th. an«body «„«,o™ b«=om« concenwted J 

ma cancer celfe, „„p™.|„g effect on mow „», and *e™asln, ,1,e affact 00 non^ce, cells 

.ha. have n,e»s.a3.e. ,o ofhe, «ss.e s«,. Fu*enn=re, ^ J 

er<kjcyt„s», will enhance erto- of the eflecto, into the cell interior 

For me p-rpos. of taiseung, an anlibod, specllic for the pn«ein of can«r^sweiaed 

^e.^iup.«h,.K.«e.ffec.co.„p„ent,p,efera.,p,aco.afen.o,h,^;.Zr 
*''»»^«*«»'Con,P-"»ln««*c«,,po.«ionslndudarad^^ 

« »"*a,»«ds«ne.,„d.oxi=p««d«^h..d,^«.r«.o* Other s*o^ effector co™ 
i^ud.p.p,«.,orpo,nuc..o«.^.o,.^.,»p,eno.^^,,^^,^^7;^^^^^^ 

In n»st app,^ of ««,ody mohctifa. hun«, .h«,p,, , „ p„,„„. „ ^ 
monoconals. or ,n«,»x,^ „,« Have been humane „ ,.chn.,,«, m „,. ' 
» P-«"''^-«hod,,™^,eether„,*es^.econ.n5.«,^o,«.,K»,.,™„„,»J;:"'*' 
An e^rnple of tK« poVpeptides en*odled in «,ls In^Moo «„ b. «fec««,, us^ ■„ 

•«*«n,,..hrot,,h»ccination.The9,o«hofcancercefels„atu,,lly,imi« 

an*..^ and T c«,. and conaa^en, .riggenn, of l,w„.r» effector ,„„c«ons that L tun^ 
rr?: '*"*^°'"»'"~"""*"'«'^ = P«-^""-^«ca„.3enl^ 
P=«;pep.de«™d.d by «» cONA 0. ^tioo ^ ^ J 
hav,n,ov.raP„,^o..,.»con«p,«,„,PNA ma, .to b. . ^pt^^e rTfe^ 
^ ^~'""P«P"«onP««.PO.-«»-~«op*«c,^..^w»ova«b.j;^ol^^^ 

MacieanTalT T"^ T "a^ - i™«n - B« « (Ba.,*,,,, 

•tecLean et al.). Fo, exainpfe. a,n,he*c antigens are conjugated to a eanteliKa Wyhoh 

,KUH). and then coniblnad with an adi-van, ^ ^ OETOX- . nltt 
2~l=-wa.sandl,*.A ^P0.pep.»e encoded bytt.^™,.,^,::^^ 
Il>««ivan»oncaiibewedin«ialo9ousconipositlons. "dasenDadin 

"^•'^P'«P-*«««"«mlnl«am,polypep«devacclnesarel,n«^ p_S^ 
KUf. Prata-ably. «» vaoon, a*o coa,pds«i an «»««,t «.* a, alum, n^^, 
liposomes, or DETOX™. The vaccine may optionally comprise auxiliary substances such as wetting
agents, emulsifying agents, and organic or inorganic salts or acids. It also comprises a
pharmaceutically acceptable excipient which is compatible with the active ingredient and appropriate
for the route of administration. The desired dose for peptide vaccines is generally from 10 µg to 1 mg,
with a broad effective latitude. The vaccine is preferably administered first as a priming dose, and
then again as a boosting dose, usually at least four weeks later. Further boosting doses may be given
to enhance the effect. The dose and its timing are usually determined by the person responsible for
the treatment.

The foregoing detailed description provides, inter alia, a detailed explanation of how genes
associated with cancer can be identified and their cDNA obtained. Polynucleotide sequences for
CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 are provided.

The sequence data listed in this application was obtained by two-directional sequencing,
except where indicated otherwise. The data are believed to be accurate; nevertheless, it is readily
appreciated that the techniques of the art as used herein have the potential of introducing occasional
and infrequent sequence errors. Clones and inserts obtained via PCR may also comprise occasional
errors introduced during amplification. Nucleotide sequences predicted from database compilations
and sequence data obtained by one-directional sequencing may also contain occasional errors in
accordance with the limitations of the underlying techniques. In addition, allelic variations to both
nucleotide and amino acid sequences may occur naturally or be deliberately induced. Differences of
any of these types between the sequences provided herein and the invention as practiced may be
present without departing from the spirit of the invention.

Sequence data for CH8-2a13-1 and CH13-2a12-1 cDNA are believed to comprise the entire
translated coding sequence, and 5' and 3' untranslated regions corresponding to those found in
typical mRNA transcripts. Multiple mRNA transcripts may be found depending on the patterns of
transcript processing in various cell types of interest. Sequence data for CH1-9a11-2 and
CH14-2a16-1 cDNA comprise a portion of the coding sequence and 3' untranslated regions.
Additional sequence is typically present in the corresponding mRNA transcripts, comprising an
additional coding region in the N-terminal direction of the protein, and possibly a 5' untranslated
region.

Certain embodiments of this invention may be practiced by polynucleotide synthesis
according to the data provided herein, by rescuing an appropriate insert corresponding to the gene of
interest from one of the deposits listed below, or by isolating a corresponding polynucleotide from a
suitable tissue source. Various useful probes and primers for use in polynucleotide isolation are
provided herein, or may be designed from the sequence data.

[...] (ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852, under the terms of the Budapest Treaty. The
deposits are outlined in Table 2:




BC6F1 

Accession No. 
98074 



Accession No. 
97595 




TABLE 2: ATCC Deposits

Mixture of E. coli with recombinant plasmids of cDNA fragments [...]
associated with breast cancer. The 8 recombinant [...]
by plating on ampicillin [...] selecting single colonies for analysis by PCR
using SP6 and T7 primers.

Gene           Subclone     Expected size of PCR product
CH1-9a11-2     pch1-1.1     1.1 kb
               pch1-2.5     2.5 kb
CH8-2a13-1     pch8-600     600 bp
               pch8-3k      3.0 kb
               pch8-4k      4.0 kb
CH14-2a16-1    pch14-800    800 bp
               pch14-1.6    1.6 kb
               pch14-1.3    1.3 kb
BCGF-1orBCGF-2. '^'^^'^ ^•°'^CH14-2a16-1 not present In 

The following GenBank accession numbers are listed in relation to CH1-9a11-2: dbEST
H88063; H88064; D61948; H88718; H28460; AA137920; AA145308; W12952; AA200687; N44164;
T27279; dbSTS G22044; G04961.

The following GenBank accession numbers are listed in relation to CH8-2a13-1: dbNR
D83780.

The following GenBank accession numbers are listed in relation to CH13-2a12-1: dbNR
U58090; dbEST AA182441; AA253924; M179755; AA112715; AA112640; W67977; AA150317;
W68080; AA150243; AA100446; WB9636; H46574; AA245889; AA100651; H77368; AA192778;
T85671; N32682; T86257; T78239; 777874; AA187866; 233557; R40816; N99802; R19302;
AA100650; N55904; AA257151; H77369; T79014.

The following GenBank accession numbers are listed in relation to CH14-2a16-1: dbEST
N64802; W56903; N31400; W95674; AA233551; AA233636; N24105; W03447; W25821; AA233666;
AA233647; N67843; D55778; T66839; N55370; N75650; M280736; H97110; 219643; H91250;
AA230765; R93089; T84665; W94857; R92873.

The examples presented below are provided as a further guide to a practitioner of ordinary
skill in the art, and are not meant to be limiting in any way.

Examples

Example 1: Selecting cDNA for messenger RNA that is overabundant in breast cancer cells

Total RNA was isolated from each breast cancer cell line or control cell by centrifugation
through a gradient of guanidine isothiocyanate/CsCl. The RNA was treated with RNase-free DNase
(Promega, Madison, WI). After extraction with phenol-chloroform, the RNA preparations were stored
at -70°C. Oligo-dT polynucleotides for priming at the 3' end of messenger RNA with the sequence
T11NM (where N ∈ {A,C,G} and M ∈ {A,C,G,T}) were synthesized according to standard protocols.
Arbitrary decamer polynucleotides (OPA01 to OPA20) for priming towards the 5' end were purchased
from Operon Biotechnology, Inc., Alameda, CA.
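
The set of anchored oligo-dT primers implied by the definition above can be enumerated with a short sketch. It assumes the garbled subscript in the source reads as a run of 11 T residues (consistent with the T11AC and T11CC entries of Table 3); the function name and the `t_run` parameter are illustrative only.

```python
# Minimal sketch: enumerate anchored oligo-dT primers of the form T(n)NM, written 5' to 3'.
# Assumption: the T run is 11 residues long; adjust t_run if the source intended otherwise.
from itertools import product


def anchored_oligo_dt_primers(t_run: int = 11) -> list[str]:
    """Return every T(n)NM primer with N in {A, C, G} and M in {A, C, G, T}."""
    n_bases = "ACG"    # N: any base except T, so the primer anchors at the start of the poly-A tail
    m_bases = "ACGT"   # M: any base
    return ["T" * t_run + n + m for n, m in product(n_bases, m_bases)]


primers = anchored_oligo_dt_primers()
print(len(primers))
print(primers[:3])
```

Excluding T at position N is what anchors the primer to the junction between the transcript body and the poly-A tail, so each primer samples a distinct subset of messenger RNA.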

The RNA was reverse-transcribed using AMV reverse transcriptase (obtained from BRL) and
an anchored oligo-dT primer in a volume of 20 µL, according to the manufacturer's directions. The
reaction was incubated at 37°C for 60 min and stopped by incubating at 95°C for 5 min. The cDNA
obtained was used immediately or stored frozen at -70°C.

Differential display was conducted according to the following procedure: 1 µL cDNA was
replicated in a total volume of 10 µL PCR mixture containing the appropriate T11NM sequence, 0.5 µM
of a decamer primer, 200 µM dNTP, 5 µCi [35S]-dATP (Amersham), Taq polymerase buffer with 2.5
mM MgCl2, and 0.3 unit Taq polymerase (Promega). Forty cycles were conducted in the following
sequence: 94°C for 30 sec, 40°C for 2 min, 72°C for 30 sec; and then the sample was incubated at
72°C for 5 min. The replicated cDNA was separated on a 6% polyacrylamide sequencing gel. After
electrophoresis, the gel was dried and exposed to X-ray film.
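
As a rough planning aid, the cycling program described above can be written down as data and its total hold time computed. This is only a sketch: the class and constant names are illustrative, and ramp times between temperatures (which depend on the instrument) are not included.

```python
# Minimal sketch: represent the differential display cycling program and sum the hold times.
# Ramp times between steps are instrument-dependent and deliberately ignored here.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold duration in seconds


CYCLE = (Step(94, 30), Step(40, 120), Step(72, 30))   # denature, anneal, extend
N_CYCLES = 40
FINAL_EXTENSION = Step(72, 300)


def total_hold_seconds(cycle=CYCLE, n_cycles=N_CYCLES, final=FINAL_EXTENSION) -> int:
    """Total programmed hold time, excluding temperature ramps."""
    return n_cycles * sum(step.seconds for step in cycle) + final.seconds


print(total_hold_seconds() / 60, "minutes of programmed holds")
```

Each cycle holds for 180 seconds, so forty cycles plus the final extension come to 7500 seconds of programmed holds.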

The autoradiogram was analyzed for labeled cDNA that was present in larger relative amount
in all of the lanes corresponding to breast cancer cells, compared with all of the lanes corresponding
to control cells. Fig. 1 provides an example of an autoradiogram from such an experiment. Lane
1 is from non-proliferating normal breast cells; lane 2 is from proliferating normal breast cells; lanes
3 to 5 are from breast cancer cell lines BT474, SKBR3, and MCF7. The left and right sides show
the pattern obtained from experiments using the same T11NM sequence (T11AC), but two different
decamer primers. [...] all three
tumor lines compared with controls.

The assay illustrated in Figure 1 was conducted using different combinations of oligo-dT
primers and decamer primers. A number of differentially expressed bands were detected when
different primer combinations were used. However, not all differences seen initially were
reproducible after re-screening. We therefore routinely repeated each differential display for each
primer combination. Only bands showing RNA overabundance in at least 2 experiments were
selected for further analysis.
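
The selection rule described above (a band must be stronger in every cancer lane than in every control lane, and must reproduce in at least two experiments) can be sketched as follows. The band intensities, band names, and function names are hypothetical; the source scored bands by inspection of autoradiograms, not by this code.

```python
# Minimal sketch of the band selection rule. All values and names are illustrative.

def overabundant(cancer_lanes: list[float], control_lanes: list[float]) -> bool:
    """True if the band is stronger in all cancer lanes than in all control lanes."""
    return min(cancer_lanes) > max(control_lanes)


def reproducible_bands(calls: dict[str, list[bool]], min_hits: int = 2) -> list[str]:
    """Keep bands called overabundant in at least min_hits independent experiments."""
    return [band for band, hits in calls.items() if sum(hits) >= min_hits]


# Hypothetical intensities for one band: three tumor lanes versus two control lanes.
print(overabundant([5.0, 4.0, 6.0], [1.0, 2.0]))

# Hypothetical calls from repeated displays with the same primer combination.
print(reproducible_bands({"band_1": [True, True], "band_2": [True, False]}))
```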

It is preferable to include in the differential display experiment RNA derived from uncultured
normal mammary epithelial cells (termed "organoids"). These cells are obtained from surgical
samples resected from healthy breast tissue, which are then coaxed apart by blunt dissection
techniques and mild enzyme treatment. Using organoids as the negative control, 33 cDNA
fragments were isolated from 15 displays.

^■^co„Aa^coa^^„^„^^„,^^^ 
cDNA fragments that were differentially expressed in the fashion described in Example 1
were excised from the dried gel and extracted by boiling at 95°C for 10 min. Eluted cDNA was
recovered by ethanol precipitation, and replicated by PCR. The product was cloned into the pCRII
vector using the TA cloning system (Invitrogen).

[...] breast cancer cell lines
BT474, SKBR3 and ZR-75-30 were used to prepare Southern blots to screen the cloned cDNA
fragments. The cloned cDNA fragments were labeled [...]
cell DNA indicated that the corresponding gene had been duplicated in the [...]

cDNA probes were also used in Northern blots to verify that the corresponding RNA was
overabundant in the appropriate cell lines.

[...] procedure
corresponded to novel genes, a partial nucleotide sequence was obtained using M13 primers.
Each sequence was compared with the known sequences in GenBank. In initial experiments, 5 of
the first 7 genes sequenced were mitochondrial genes. To avoid repeated isolation of
mitochondrial genes, subsequent screening experiments were done with additional lanes in the
DNA blot analysis for EcoRI-digested and HindIII-digested mitochondrial DNA. Any cDNA fragment
that hybridized to the appropriate mitochondrial restriction fragments was suspected of
corresponding to a mitochondrial gene, and not analyzed further.

From the 33 cDNA fragments detected from differential displays using organoid mRNA, 12 [...]
Three cDNA failed to detect duplicated genes, and 3 appeared to correspond to mitochondrial
genes. Sequence analysis of the 6 suitable cDNA fragments showed no identity to any known [...]
To obtain longer cDNA corresponding to the cDNA fragments with novel sequences, the
fragments were used as probes to screen a cDNA library from breast cancer cell line BT474
constructed in lambda GT10. The longer cDNA obtained from lambda GT10 were sequenced
using lambda GT10 primers. The chromosomal locations of the cDNAs were determined using
panels of somatic cell hybrids.

[...] the 6 novel cDNA identified so far have been processed in this fashion. The
probes used to obtain the 4 new breast cancer genes are shown in Table 3.



TABLE 3: Primers Used for Differential Display

cDNA           Oligo-dT primer          Arbitrary primer
CH1-9a11-2     T11CC (SEQ ID NO:9)      SEQ ID NO:11
CH8-2a13-1     T11AC (SEQ ID NO:10)     SEQ ID NO:12
CH13-2a12-1    T11AC (SEQ ID NO:10)     SEQ ID NO:13
CH14-2a16-1    T11AC (SEQ ID NO:10)     SEQ ID NO:14

Example 3: Using the cDNA to test panels of breast cancers

To determine the proportion of breast cancers in which the putative breast cancer genes
were duplicated, or showed RNA overabundance without gene duplication, the four cDNA obtained
according to the selection procedures described were used to probe a panel of breast cancer cell
lines and primary tumors.

Gene duplication was detected either by Southern analysis or slot-blot analysis. For
Southern analysis, 10 µg of EcoRI-digested genomic DNA from different cell lines was
electrophoresed on 0.8% agarose and transferred to a HYBOND™ membrane (Amersham). The
filters were hybridized with 32P-labeled cDNA for the putative breast cancer gene. After an
autoradiogram was obtained, the probe was stripped and the blot was re-probed using a reference
probe to adjust for differences in sample loading. Either chromosome 2 probe D2S5 or chromosome
21 probe D21S6 was used as a reference. Densities of the signals on the autoradiograms were
obtained using a densitometer (Molecular Dynamics). The density ratio between the breast cancer
gene and the reference gene was calculated for each sample. Two samples of placental DNA digests
were run in each Southern analysis as a control.

For slot-blot analysis, 1 µg of genomic DNA was denatured and slotted on the HYBOND™
membrane. D21S5 or human repetitive sequences were used as reference probes for slot blots. The
density ratio between the breast cancer gene and the reference gene was calculated for each sample.
10-15 samples of placental DNA digests were used as control. Amongst the control samples, the
highest density ratio was set at 1.0. The density ratios of the tumor cell lines were standardized
accordingly. An arbitrary cut-off for the standardized ratio (typically 1.3) was defined to identify
samples in which the putative gene had been duplicated. Each of the cell lines in the breast cancer
panel was scored positively or negatively for duplication of the gene being tested.
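
The standardization and scoring steps described above can be sketched as follows. The densitometer readings and sample names are hypothetical, and the sketch assumes that a standardized ratio equal to the cut-off counts as positive, which the source does not state.

```python
# Minimal sketch of ratio standardization and +/- scoring. All readings are hypothetical.
# Assumption: a standardized ratio equal to the cut-off is scored "+".

def density_ratio(gene_signal: float, reference_signal: float) -> float:
    """Ratio of the test-gene band density to the reference-probe band density."""
    return gene_signal / reference_signal


def standardize(sample_ratios: dict[str, float], control_ratios: list[float]) -> dict[str, float]:
    """Rescale so that the highest control ratio corresponds to 1.0."""
    top_control = max(control_ratios)
    return {name: ratio / top_control for name, ratio in sample_ratios.items()}


def score(standardized_ratio: float, cutoff: float = 1.3) -> str:
    """Score a sample '+' (duplicated) or '-' against the arbitrary cut-off."""
    return "+" if standardized_ratio >= cutoff else "-"


controls = [density_ratio(40.0, 100.0), density_ratio(50.0, 100.0)]
samples = {"line_A": density_ratio(100.0, 100.0), "line_B": density_ratio(60.0, 100.0)}
standardized = standardize(samples, controls)
calls = {name: score(value) for name, value in standardized.items()}
print(standardized, calls)
```

Normalizing to the highest control rather than the mean makes the call conservative: a tumor sample must exceed every control by the cut-off margin to be scored positive.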

Some of the cell lines in the panel were known to have duplicated chromosomal regions from
comparative genomic hybridization analysis. In instances where the cDNA being used as probe
mapped to the known amplified region, the cDNA indicated that the corresponding gene had also
been duplicated. However, duplicated genes were also detected using each of the four cDNAs in
instances where comparative genomic hybridization had not revealed any amplification.

Because of the nature of the technique, the standardized ratio calculated as described
underestimates the gene copy number, although it is expected to rank in the same order. For
example, the standardized ratio obtained for the c-myc gene in the SKBR3 breast cancer cell was 5.0.
However, it is known that SKBR3 has approximately 50 copies of the c-myc gene.

To test for overabundance of RNA, 10 µg of total RNA from breast cancer cell lines or primary
breast cancer tumors were electrophoresed on 0.8% agarose in the presence of the denaturant
[...]
32P-labeled cDNA for the beta-actin gene to adjust for differences in sample loading. Ratios of
densities between the candidate gene and the beta-actin gene were calculated. RNA from three
different cultured normal epithelial cells were included in the analysis as a control for the normal level
of gene expression. The highest ratio obtained from the normal cell samples was set at 1.0, and the
ratios in the various tumor cells were standardized accordingly.
Example 4: Chromosome 1 gene CH1-9a11-2

One of the cDNA obtained through the selection procedures of Examples 1 and 2
corresponded to a gene that mapped to Chromosome 1.

Table 4 summarizes the results of the analysis for gene duplication and RNA overabundance.
Both quantitative and qualitative assessment is shown. The numbers shown were obtained by
comparing the autoradiograph intensity of the hybridizing band in each sample with that of the
controls. Several control samples were used for the gene duplication experiments, consisting of
different preparations of placental DNA. The control sample with the highest level of intensity was
used for standardizing the other values. Other sources used for this analysis were breast cancer cell
lines with the designations shown. For reasons stated in Example 3, the quantitative number is not a
direct indication of the gene copy number, although it is expected to rank in the same order. Similarly,
up to 6 control samples were used for the RNA overabundance experiments, consisting of different
preparations of breast cell organoids which had been maintained briefly in tissue culture until the
experiment was performed. The control sample with the highest level of intensity was used for
standardizing the other values. Each cell line was scored + or - according to an arbitrary cut-off value.
TABLE 4: Chromosome 1 Gene in Breast Cancer Cell Lines

                 CH1-9a11-2          CH1-9a11-2 RNA Overabundance
Source           Gene Duplication    [...]          4.4 kb
Normal               1.00*               1.00**         1.0**
BT474            +   2.70            +   1.57       +   3.7
ZR-75-30         +   2.65                nd             nd
MDA453           +   2.86            +   5.79       +   6.2
MDA435           +   3.72            -   0.89       +   2.4
SKBR3            +   1.86            -   0.94       +   2.9
600PE            +   1.72            +   4.47       +   6.8
MDA157           +   1.49            -   1.08       +   1.4
MCF7             +   1.95                nd             nd
DU4475           +   2.02            -   1.13       +   1.5
MDA231           -   1.23            +   1.47       -   [...]
BT20             -   1.09            -   0.83       +   1.9
T47D             -   1.05                nd             nd
UACC812          -   0.67            +   1.57       +   1.8
MDA134           -   1.19            +   5.04       +   7.1
CAMA-1           -   1.02            +   2.51       +   7.2
Incidence (%)    9/15 (60%)          7/12 (58%)     11/12 (92%)

+ Gene duplication or RNA overabundance; - no duplication or overabundance [...]

The gene corresponding to the CH1-9a11-2 cDNA was duplicated in 9 out of 15 (60%) of the

«. 1.6 b«« ta„h. 6' »K1 0,*, CONA flagmen. ,SEQ. ,D NO:,, . i„ P ^ 
«^™»-«^h=™togv to en. ,„Oe„e.nk. 0«, o, «,e Z««e 
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The CH1.9a11.2 gene ^ ft,*er che^oetort ^„ „ 

.nfo^aton. * ,^t,0 cDNA librae .^e breas, cancer ce» ,™ BT474 ,Ex.n^^, 

using ,he l„«al cDNA N^^ and a ctone «h a 2.5 *base The 

cDNA inserts were used as initial sequencing primers:

T7 primer: (SEQ. ID NO:42)
5'-TAATACGACTCACTATAGGGAGA-3'

SP6 primer: (SEQ. ID NO:43)
5'-CATACGATTTAGGTGACACTATAG-3'

Sequencing continued by walking along the region of interest by standard techniques [...]
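
As a simple sanity check on the two sequencing primers listed above, their length and GC content can be computed. This is a sketch for illustration only; the function name is hypothetical and no such check is described in the source.

```python
# Minimal sketch: basic properties of the T7 and SP6 sequencing primers listed above.

T7_PRIMER = "TAATACGACTCACTATAGGGAGA"     # SEQ. ID NO:42
SP6_PRIMER = "CATACGATTTAGGTGACACTATAG"   # SEQ. ID NO:43


def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


for name, primer in (("T7", T7_PRIMER), ("SP6", SP6_PRIMER)):
    print(name, len(primer), round(gc_fraction(primer), 2))
```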

[...] obtained using
the CLONTECH Marathon™ cDNA Amplification Kit. A map showing the overlapping [...]
Figure 6. Briefly, two DNA primers designated CH1a and CH1b (Figure 7) were synthesized.
Polyadenylated RNA from breast cancer cell line 600PE was reverse transcribed using CH1b primer.
After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded
cDNA. The 5' end cDNA of CH1-9a11-2 was then amplified by PCR using primers CH1a and AP1
(provided in the kit). To increase the specificity of the PCR products, the first PCR products were
PCR reamplified using nested primers CH1a and AP2 (provided in the kit). The PCR products were
cloned into pCRII vector (Invitrogen) and screened with CH1-9a11-2 probe.

The sequence of 3452 base pairs between the 5' end of pCH1-1.1 and the poly-A tail of
CH1-9a11-2 was determined by standard sequencing techniques. The DNA sequence is shown in Figure
8 (SEQ. ID NO:15). The longest open reading frame is in frame 1 (bases 1-1876), and codes for 624
amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown
in the upper panel of Figure 9 (SEQ. ID NO:16). The partial sequence predicted for the translated
protein is listed in the lower panel of Figure 9 (SEQ. ID NO:17). Bases 1876 to the end of the sequence
are believed to be a 3' untranslated region. A hydrophobicity analysis identified a putative membrane
insertion or membrane spanning region at about amino acids 382-400, indicated in Figure 9 by
underlining.
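
The idea of locating the longest open reading frame among the three forward frames can be sketched as below. This is an illustration under stated assumptions, not the analysis actually used: an open frame is taken to be any stop-free run of codons (no start codon is required, since the cloned sequence is open at its 5' end), and the test sequence is made up.

```python
# Minimal sketch: find the longest stop-free run of codons in the three forward frames.
# Assumption: no ATG is required, because a partial cDNA may be open at its 5' end.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_frame(seq: str) -> tuple[int, int, int, int]:
    """Return (frame, first_base, last_base, codons), with 1-based inclusive coordinates."""
    seq = seq.upper()
    best = (0, 0, 0, 0)
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                codons = (pos - run_start) // 3
                if codons > best[3]:
                    best = (frame + 1, run_start + 1, pos, codons)
                run_start = pos + 3      # next run begins after the stop codon
            pos += 3
        codons = (pos - run_start) // 3  # run that reaches the 3' end without a stop
        if codons > best[3]:
            best = (frame + 1, run_start + 1, pos, codons)
    return best


# Hypothetical 27-base sequence: frame 1 is open for 8 codons, then ends at TAA.
print(longest_open_frame("GCTAACGTGACCGCTAACGTGACCTAA"))
```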

Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising
approximately 1934 base pairs 5' from the sequence of Figure 8. The additional sequence data was
obtained by rescuing and amplifying two further fragments of CH1-9a11-2 cDNA. Nested primers
were designed ~100 base pairs downstream from the 5' end of the known sequence. The primers
were used in a nested amplification assay using AP1 and AP2, using the CLONTECH Marathon™
cDNA Amplification Kit as described above. The template for the first upstream fragment was
reverse-transcribed polyadenylated RNA from breast cancer cell line 600PE, as described earlier.
This fragment was sequenced, and another set of nested primers was designed. The template for the
next upstream fragment was a Marathon™ ready cDNA preparation from human testes, also supplied
by CLONTECH.

The nucleotide sequence shown in Figure 23 comprises an open reading frame through to
the 5' end. Figure 24 shows the corresponding protein translation. About another 500-1000
bases are predicted to be present in CH1-9a11-2 in the 5' direction, with the protein encoding sequence
beginning somewhere within this additional sequence. Sequencing of the encoding region is
completed by obtaining additional CH1-9a11-2 fragments in this direction.

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed
through the National Center for Biotechnology Information on February 23, 1996. Short segments of
homology with other reported human sequences were found at the nucleotide level (<500 base pairs),
but none with any ascribed function in the respective identifier. At the amino acid level, no identity
higher than 30% was found with any reported eukaryotic sequences.
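
The 30% figure above is a percent identity over aligned positions. As a minimal sketch of how such a value can be computed, the following assumes two sequences that have already been aligned to equal length, with "-" marking gaps; the function name and the example peptides are hypothetical and are not taken from the disclosure or from the BLAST program itself.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, non-gap positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap positions are not counted (an assumption of this sketch)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned peptide fragments, for illustration only.
query = "MKT-LLVAGA"
hit = "MRTALLIAGS"
identity = percent_identity(query, hit)
exceeds_30_percent = identity > 30.0
```

Different tools count gap positions differently, so a value computed this way is only comparable with a reported figure when the same convention is used.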

A CH1-9a11-2 cloned insert has been used to probe the level of relative expression in
polyadenylated RNA from a panel of tissue sources. The RNA was obtained already prepared for
Northern blot analysis (CLONTECH Catalog # 7759-1, 7760-1 and 7756-1). According to the
manufacturer, the blots contain about 2 µg of poly-A RNA per lane, run on a denaturing
formaldehyde 1.2% agarose gel, transferred to a nylon membrane, and fixed by UV irradiation. The
relative CH1-9a11-2 expression observed at the RNA level is shown in Table 5:

TABLE 5: Northern blot analysis

Tissue               CH1-9a11-2 mRNA
heart                ++
brain                [not legible]
placenta             [not legible]
lung                 [not legible]
liver                +/-
skeletal muscle      [not legible]
kidney               +/-
pancreas             [not legible]
spleen               [not legible]
thymus               +
prostate             [not legible]
testis               [not legible]
ovary                ++
small intestine      +
colon                +/-
peripheral blood     +/-

++++  Very high
+++   High
++    Medium
+     Low
+/-   Very low



Relatively elevated levels of expression were observed in heart, placenta, pancreas, prostate, testis
and ovary. The level of expression in breast cancer cell lines is also relatively high on this
scale, since the Northern analysis performed on these lines (described above) was conducted on
total cellular RNA, of which polyadenylated RNA constitutes only about 5%. It is likely that the
CH1-9a11-2 gene is involved in a biological process that is typical to the tissue types showing medium to
high levels of expression, which may relate to increased tissue growth or metabolism.

Since the obtained sequence is shorter than the apparent size of mRNA observed in
Northern analysis (Table 1), an additional polynucleotide segment is believed to be present at the 5'
end of the sequence shown in SEQ. ID NO:15. Further sequence data at the 5' end is deduced by
obtaining additional cloned cDNA using standard techniques. Briefly, in one approach, mRNA from
breast cancer cell lines MDA453 and/or 600PE is cloned and screened using primers based on
sequence data from SEQ. ID NO:15. Two nested primers of about 20 nucleotides are prepared: the
innermost about 150 base pairs from the 5' end, and the outermost about 170 base pairs from the 5'
end. The outermost primer is used to synthesize a first cDNA strand complementary to the mRNA in
the upstream direction. Second strand synthesis is performed using reagents in a CLONTECH
Marathon™ cDNA amplification kit according to manufacturer's directions. The double-stranded DNA
is then ligated at the 5' end of the coding sequence with the double-stranded adaptor fragment
provided in the kit. A first PCR amplification (about 30 cycles) is performed using the first adaptor
primer from the kit and the outermost RNA-specific primer, and a second amplification (about 30
cycles) is performed using the second adaptor primer and the innermost RNA-specific primer. In an
alternative approach, a CLONTECH RACE-READY single-stranded cDNA from human placenta is
PCR amplified using nested 5' anchor primers in combination with the outermost and innermost
RNA-specific primers. Amplified DNA obtained using either approach is analyzed by gel electrophoresis
and cloned into plasmid vector pCRII. Clones are screened, as necessary, using the 2.5 kilobase
CH1-9a11-2 insert. Clones corresponding to full-length mRNA (4.5 kb or 5.5 kb; Table 1) or cDNA
fragments overlapping at the 5' end are selected for sequencing. Compared with the 4.5 kb form,
additional polynucleotide segments may be present in the 5.5 kb form, within the encoding region or in
the 5' or 3' untranslated region.
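
The nested primer layout described above can be sketched in code. The following is an illustration only, assuming that each RNA-specific primer is the reverse complement of a 20-nucleotide window of the known sense-strand sequence ending at the stated distance from its 5' end. The function names and the placeholder sequence are hypothetical; real primer design would also weigh melting temperature and secondary structure, which this sketch ignores.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_primer(known: str, offset: int, length: int = 20) -> str:
    """Primer complementary to the sense strand.

    The primer covers sense-strand bases (offset - length) to offset, counted
    from the 5' end, so that extension proceeds in the upstream direction.
    """
    if offset < length or offset > len(known):
        raise ValueError("offset must lie between length and len(known)")
    return reverse_complement(known[offset - length:offset])


# Placeholder for the known sequence (hypothetical, 200 nucleotides).
known_sequence = "ATGGCCAAGTTCGACCTGAA" * 10

innermost = antisense_primer(known_sequence, 150)  # used in the second amplification
outermost = antisense_primer(known_sequence, 170)  # used for first strand synthesis
```

Because the outermost primer lies farther from the 5' end, the product of the first amplification contains the binding site of the innermost primer, which is what makes the second amplification nested.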

Example 5: Chromosome 8 gene CH8-2a13-1

One of the cDNA obtained corresponded to a gene that mapped to Chromosome 8. Figure 2
shows the Southern blot analysis for the corresponding gene in various DNA digests. Lane 1 (Pi2) is
the control preparation of placental DNA; the rest show DNA obtained from human breast cancer cell
lines. Panel A shows the pattern obtained using the 32P-labeled CH8-2a13-1 cDNA probe. Panel B
shows the pattern obtained with the same blot using the 32P-labeled D2S6 probe as a loading control.
The sizes of the restriction fragments are indicated on the right.

Figure 3 shows the Northern blot analysis for RNA overabundance. Lanes 1-3 show the level
of expression in cultured normal epithelial cells. Lanes 4-19 show the level of expression in human
breast cancer cell lines. Panel A shows the pattern obtained using the CH8-2a13-1 probe; panel B
shows the pattern obtained with beta-actin cDNA, a loading control.

The results are summarized in Table 6. The scoring method is the same as for Example 4.

TABLE 6: Chromosome 8 Genes in Breast Cancer Cell Lines

Source (in printed order): Normal, SKBR3, ZR-75-30, BT474, MDA157, MCF7, CAMA-1, MDA361, MDA468,
T47D, MDA453, MDA134, MDA435, 600PE, UACC812, MDA231, DU4475, BT468, BT20

CH8-2a13-1 Gene Duplication
  Entries (in printed order): 1.00*; + 4.25; 0.0[?]; + 1.53; + 2.02; + 1.84; + 3.62; + 2.00; nd;
  + 1.41; 1.83; + 1.30; + 2.15; 0.95; 1.25; 0.80; 0.85; 0.37; 0.95
  Incidence: 12/17 (71%)

CH8-2a13-1 RNA Overabundance
  Scores (in printed order): +; nd; +; +; +; +; +; +; +; +; +; +; +
  Values (in printed order): 1.00*; 4.30; 1.72; 3.39; 4.92; 2.14; 1.74; 4.50; 1.58; 3.10; 3.70;
  4.94; 2.04; 2.40; 1.28; 0.88; 0.70; 0.82
  Incidence: 14/17 (82%)

Gene Duplication
  Scores (in printed order): +; +; +; +; +; nd; nd
  Values (in printed order): 1.00*; 4.73; 2.24; 1.76; 1.39; 3.10; 1.61; 1.02; 0.90; 0.88; 1.00;
  0.54; 0.74; 1.27; 0.50; 0.23
  Incidence: 7/16 (44%)

+ = gene duplication or RNA overabundance; - = no duplication or overabundance; nd = not done.
* Degree of gene duplication is reported relative to [...] several cultures of [...]

The gene corresponding to CH8-2a13-1 showed clear evidence of duplication in 12 out of 17
(71%) of the cells tested. RNA overabundance was observed in 14 out of 17 (82%). Thus, 11% of
the cells had achieved RNA overabundance by a mechanism other than gene duplication.
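
The percentages in the preceding paragraph follow from the stated counts. The sketch below reproduces the arithmetic, assuming that each incidence is rounded to a whole percent and that the 11% figure is the difference of the two rounded values (the unrounded difference, 2 out of 17, is closer to 12%). The function name is hypothetical.

```python
def incidence_percent(positive: int, tested: int) -> int:
    """Incidence as a whole-number percentage of the lines tested."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return round(100 * positive / tested)


duplication = incidence_percent(12, 17)      # gene duplication
overabundance = incidence_percent(14, 17)    # RNA overabundance

# Share of lines with RNA overabundance not accounted for by duplication,
# taken here as the difference of the rounded percentages (an assumption).
other_mechanism = overabundance - duplication
```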

Since the known oncogene c-myc is located on Chromosome 8, the Southern analysis was
also conducted using a probe for c-myc. At least 2 of the breast cancer cells showing duplication of
the gene corresponding to CH8-2a13-1 did not show duplication of c-myc. This indicates that
the gene corresponding to CH8-2a13-1 is not part of the myc amplicon.

The sequence of 150 bases from the 5' end of the cDNA fragment is shown in Figure 22
(SEQ ID NO:3). There was no substantial homology to any known gene in GenBank. One of the [...]

The CH8-2a13-1 gene was further characterized by obtaining additional sequence [...]
screened using the initial [...]. The two identified clones were subcloned into plasmid vector
pCRII. [...] by standard [...] on data already obtained. The two inserts were found to overlap
(Figure 6). Primers used are those designated 1-25 in Figure 10.

A third clone of about 600 bp (designated pCH8-600) overlapping on the 5' end (Figure 6) [...]

A fourth clone of about 600 bp overlapping pCH8-600 at the 5' end [...]. Briefly, a DNA primer
was synthesized corresponding to about the first 20 [...] sequence, and used along with a primer
based on the [...]

[...] nucleotide sequence of CH8-2a13-1 cDNA is shown in Figure 13 (SEQ. ID NO:21). The
corresponding amino acid sequence of this frame is shown in Figure 14 (SEQ. ID NO:22). A
polynucleotide comprising the compiled sequence is assembled by joining the insert of this fourth
clone to pCH8^k within the shared region. Briefly, pCH8^k is cut with XbaI and NotI. The fourth
clone is cut with BamHI and XbaI. The ligated polynucleotide is then inserted into pCRII cut with
BamHI and NotI.

A CH8-2a13-1 cloned insert has been used to probe the level of relative expression in
polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4.
The relative CH8-2a13-1 expression observed at the mRNA level is shown in Table 7:

TABLE 7: Northern blot analysis

Tissue               CH8-2a13-1 mRNA
heart                ++
brain                +
placenta             +
lung                 +
liver                [not legible]
skeletal muscle      +/-
kidney               +/-
pancreas             +/-
spleen               +
thymus               +
prostate             +
testis               [not legible]
ovary                [not legible]
small intestine      +
colon                +
peripheral blood     [not legible]

++++  Very high
+++   High
++    Medium
+     Low
+/-   Very low





Relative levels of expression observed were as follows: Low levels of expression were observed in
adult peripheral blood leukocytes (PBL), brain, placenta, lung, liver, skeletal muscle, kidney, and
pancreas. Medium levels of expression were observed in adult heart, spleen, thymus, prostate, testis,
ovary, small intestine, and colon. High levels of expression were observed in four fetal tissues tested:
brain, lung, liver and kidney. The level of expression in breast cancer cell lines is relatively high
(about ++++ on the scale), since the Northern analysis performed on these lines was conducted on
total cellular RNA. It is likely that the CH8-2a13-1 gene is involved in a biological process that is
typical to the tissue types showing medium to high levels of expression, which may relate to increased
tissue growth or metabolism.

Example 6: Chromosome 13 gene CH13-2a12-1

[...] lines. Panel A shows the pattern obtained using the CH13-2a12-1 cDNA probe; panel B
shows [...]

Figure 6 shows the Northern blot analysis for RNA overabundance of the CH13-2a12-1 gene.
Lanes 1-3 show the level of expression in [...] epithelial cells. Lanes [...] cancer [...]. Panel A
shows the pattern obtained [...] believed to occur at sizes of about 3.2 and 3.5 kb.

The results of the RNA abundance comparison are summarized in Table 8. The scoring
method is the same as for Example 4.

TABLE 8: Chromosome 13 Gene in Breast Cancer Cell Lines

Columns: Source; CH13-2a12-1 Gene duplication; CH13-2a12-1 RNA Overabundance

Source (in printed order): Normal, 600PE, BT474, SKBR3, MDA157, CAMA-1, MDA231, T47D, MDA468,
MDA361, MDA435, MDA134, DU4475, MDA453

The gene corresponding to CH13-2a12-1 was duplicated in 7 out of 16 (44%) of the cells
tested. Three of the positive cell lines (600PE, BT474, and MDA435) had been studied previously by
comparative genomic hybridization, but had not shown amplified chromatin in the region where
CH13-2a12-1 has been mapped in these studies.

RNA overabundance was observed in 13 out of 16 (81%) of the cell lines tested. Thus, 37%
of the cells had achieved RNA overabundance by a mechanism other than gene duplication.
Cells from primary breast tumors have also been analyzed for duplication of the
Chromosome 13 gene. Ten of the 82 tumors analyzed (12%) were positive, confirming that
duplication of this gene is not an artifact of in vitro culture.

The sequence of [...] bases from the 5' end of the cDNA fragment is shown in Figure
22 (SEQ ID NO:5). There was no substantial homology to any known gene in GenBank. One of the
three possible reading frames was found to be open, with the predicted amino acid sequence shown
in Figure 22 (SEQ ID NO:6).

The CH13-2a12-1 gene was further characterized by obtaining additional sequence
information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was
screened using the initial cDNA insert, and clones with a 3.5 kilobase and a 1.6 kilobase insert were
identified. The two identified clones were subcloned into plasmid vector pCRII. T7 and Sp6 primers
for regions flanking the cDNA inserts were used as initial sequencing primers. Sequencing continued
by walking along the region of interest by standard techniques, using sequencing primers based on
data already obtained. The inserts were found to overlap (Figure 6). Primers used during
sequencing are shown in Figure 15.

[...] a nucleic acid sequence of 3339 base pairs between the 5' end and the poly-A tail of
CH13-2a12-1 was determined. The DNA sequence is shown in Figure 16 (SEQ. ID NO:23). Bases
1-[...] are believed to be a 5' untranslated region. The longest open reading frame is in frame 2,
from base 521 to 1838, and codes for [...] in the upper panel of Figure 17 (SEQ. ID NO:24). The
sequence for the translated protein is [...] the lower panel of Figure 17 (SEQ. ID NO:25). Bases
1638 to 3339 of the nucleotide sequence are believed to be a 3' untranslated region, which is present
in the 3.5 kb insert. The 3.5 kb [...] in the sequence.

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed
through the National Center for Biotechnology Information [...]. [...] none with any ascribed function
in the respective identifier. [...] a higher or lower rate of [...] rabbit kidney medulla
(Burnatowska-Hledin et al.). VACM-1 has a transmembrane

WO 97/38085



PCT/US97/05930 



sequence, whereas none has been detected in CH13-2a12-1. Nevertheless, it is possible that the
CH13-2a12-1 protein product has a Ca++ binding or Ca++ mobilizing function.

A CH13-2a12-1 cloned insert has been used to probe the level of relative expression in
polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4.
The relative CH13-2a12-1 expression observed at the mRNA level is shown in Table 9:

TABLE 9: Northern blot analysis

Tissue               CH13-2a12-1 mRNA
heart                ++++
brain                +
placenta             ++
lung                 +
liver                ++
skeletal muscle      ++++
kidney               +
pancreas             ++
spleen               [not legible]
thymus               [not legible]
prostate             ++
testis               +++
ovary                [not legible]
small intestine      ++
colon                +
peripheral blood     [not legible]

++++  Very high
+++   High
++    Medium
+     Low
+/-   Very low



Relatively elevated levels of expression were observed in heart, skeletal muscle and testis.
The level of expression in breast cancer cell lines is relatively high (about ++++ on the scale), since
the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that
the CH13-2a12-1 gene is involved in a biological process that is typical to the tissue types showing
medium to high levels of expression, which may relate to increased tissue growth or metabolism.

Fragments corresponding to the CH13-2a12-1 gene have also been used to screen cell lines
derived from other types of cancer. Southern analysis showed that about 1 out of 4 breast cancer cell
lines tested have gene duplication of CH13-2a12-1. Northern analysis showed that about 3 out of 6
lines tested have overexpression of the corresponding RNA transcript.

Example 7: Chromosome 14 gene CH14-2a16-1

One of the cDNA obtained corresponded to a gene that mapped to Chromosome 14. Results
of the analysis are summarized in Table 10. The scoring method is the same as for Example 4.



TABLE 10: Chromosome 14 Gene in Breast Cancer Cell Lines

Source       CH14-2a16-1            CH14-2a16-1
             Gene duplication       RNA Overabundance
Normal            1.00*                  1.00
BT474        +    2.89              +    2.57
MCF7         +    1.35              +    1.88
SKBR3        +    2.58              +    2.19
T47D              2.28              nd
MDA157       +    1.52              +    2.52
UACC812      +    2.23              nd
MDA361            0.97                   1.43
MDA453       +    1.58                   5.92
BT20                                     1.07
600PE             0.94              +    2.00
MDA231       +    1.66              +    2.19
CAMA-1            0.92                   0.71
DU4475            0.87                   1.33
BT468             0.46              nd
MDA134            0.77              +    7.17

Incidence    8/15 (53%)             10/12 (83%)

The gene corresponding to CH14-2a16-1 was duplicated in 8 out of 15 (53%) of the cells [...]
(SEQ ID NO:7). There was no substantial homology to any known gene in GenBank. One of the
three possible reading frames was found to be open, with the predicted amino acid sequence shown
in Figure 22 (SEQ ID NO:8).

The CH14-2a16-1 gene was further characterized by obtaining additional sequence
information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was
screened using the initial cDNA insert and two clones were identified: one with a 1.6 kb insert and
the other with a 2.5 kb insert. The identified clones were subcloned into plasmid vector pCRII. The
1.6 kb insert was sequenced by using T7 and Sp6 primers for regions flanking the cDNA inserts as
initial sequencing primers. Sequencing continued by walking along the region of interest by standard
techniques, using sequencing primers based on data already obtained. Primers used are those
designated 1-11 in Figure 18.

A third clone (designated pCH14-800) overlapping on the 5' end (Figure 6) was obtained
using the CLONTECH Marathon™ cDNA Amplification Kit. Briefly, DNA primers CH14a, CH14b, CH14c
and CH14d (Figure 18) were prepared. Polyadenylated RNA from breast cancer cell line MDA453
was reverse transcribed using the 14b primer. After second strand synthesis, adaptor DNA provided in
the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH14-2a16-1 was then
amplified by PCR using primers CH14b (or CH14c) and AP1 (provided in the kit). To increase the
specificity of the PCR products, the first PCR products were PCR reamplified using nested primers
CH14a (or CH14d) and AP2 (provided in the kit). The PCR products were cloned into pCRII vector
(Invitrogen) and screened with CH14-2a16-1 probe.

By sequencing pCH14-1.6 and pCH14-800, a nucleic acid sequence of 2021 base pairs
between the 5' end and the poly-A tail of CH14-2a16-1 has been determined. The DNA sequence is
shown in Figure 19 (SEQ. ID NO:26). The longest open reading frame is in frame 1 from base 1 to
792, and codes for 263 amino acids before the stop codon. The corresponding amino acid sequence
of this frame is shown in the upper panel of Figure 20 (SEQ. ID NO:27). The partial sequence
predicted for the translated protein is shown in the lower panel of Figure 20 (SEQ. ID NO:28). The 2.1
kb clone has not been sequenced, but is believed to consist of about the same region of the
CH14-2a16-1 cDNA as pCH14-1.6 and pCH14-800 combined.
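
The relation between the stated base range and the stated number of amino acids can be checked with a short calculation. The sketch below assumes that the range of bases 1 to 792 is read in a single frame and that its final codon is the stop codon; the function name is hypothetical.

```python
def residues_before_stop(first_base: int, last_base: int) -> int:
    """Number of amino acids encoded by an open reading frame.

    Bases are numbered from 1 and the range is inclusive. The last codon of
    the range is assumed to be the stop codon, which encodes no residue.
    """
    span = last_base - first_base + 1
    if span < 3 or span % 3 != 0:
        raise ValueError("range must cover a whole number of codons")
    return span // 3 - 1


amino_acids = residues_before_stop(1, 792)
```

Under that assumption, 792 bases make 264 codons, of which 263 encode amino acids, in agreement with the count given above.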

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed
through the National Center for Biotechnology Information on March 26, 1996. Short segments of
homology with other reported human sequences were found at the nucleotide level (<500 base pairs),
but none with any ascribed function in the respective identifier. At the amino acid level, the sequence
was found to share homologies within the first 106 residues with an RNA binding protein from
Saccharomyces cerevisiae with the designation NAB2. NAB2 is one of the major proteins associated
with nuclear polyadenylated RNA in yeast cells, as detected by UV light-induced cross-linking and
immunofluorescence. NAB2 is strongly and specifically associated with nuclear poly(A)+ RNA in vivo.
Gene knock-out experiments have shown that this protein is essential to yeast cell survival
(Anderson et al.). Accordingly, the protein encoded by CH14-2a16-1 is suspected of having DNA or
RNA binding activity.

A fourth clone (pCH14-1.3) has been obtained that overlaps the pCH14-800 clone at the 5'
end (Figure 6). The method of isolation was similar to that for pCH14-800, using primers based on
the pCH14-800 sequence. Partial sequence data for pCH14-1.3 has been obtained by
one-directional sequencing from the 5' and 3' ends of the pCH14-1.3 clone. Figure 21 shows the
nucleotide sequence of the 5' end (SEQ. ID NO:29) and the amino acid translation of
the likely open reading frame (SEQ. ID NO:30); the nucleotide sequence of the 3' end (SEQ. ID
NO:31) and the likely open reading frame (SEQ. ID NO:32). This data is confirmed and additional
sequence between SEQ. ID NOS:29 and 31 is obtained by fully sequencing both strands of
pCH14-1.3. Once compiled, the sequence data from pCH14-1.3, pCH14-800 and pCH14-1.6 may be shorter
than the apparent size of mRNA observed in Northern analysis (Table 1). If necessary, further
sequence data at the 5' end is deduced by obtaining additional cloned cDNA according to approaches
described in this Example or Example 4.

Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising
approximately 1934 base pairs 5' from the sequence of Figure 19. The corresponding amino acid
translation is shown in the upper panel of Figure 26. The additional sequence data was obtained by
rescuing and amplifying further fragments of CH14-2a16-1 cDNA. Nested primers were designed
~100 base pairs downstream from the 5' end of the known sequence. The primers were used in a
nested amplification assay using AP1 and AP2, using the CLONTECH Marathon™ cDNA
Amplification Kit as described above. The template was a Marathon™ ready cDNA preparation from
human testes, also supplied by CLONTECH.

The nucleotide sequence shown in Figure 25 is closed at the 5' end. The lower panel of
Figure 26 shows what is predicted to be the sequence of the gene product, beginning at the first
methionine residue. The nucleotide sequence shown contains a point difference at the position
indicated by the underlining in Figure 25. A base determined to be A from the previously obtained
polynucleotide fragment was a G in the one used in this part of the experiment. This [...] in
Figure 26. This may represent a natural allelic variation.
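
A point difference of the kind described above can be located by comparing two reads of the same region position by position. The following is a minimal sketch that assumes the two reads are already aligned and of equal length; the function name and the short example reads are hypothetical and are not the sequences of Figure 25.

```python
def point_differences(reference: str, other: str) -> list[tuple[int, str, str]]:
    """List (position, reference base, other base) wherever two reads differ.

    Positions are numbered from 1. The reads are assumed to be aligned and of
    equal length; insertions and deletions are outside this sketch.
    """
    if len(reference) != len(other):
        raise ValueError("reads must have equal length")
    return [
        (index + 1, ref_base, other_base)
        for index, (ref_base, other_base) in enumerate(zip(reference.upper(), other.upper()))
        if ref_base != other_base
    ]


# Hypothetical reads: an A in the earlier fragment appears as a G in the later one.
earlier_fragment = "ATGGCAACGT"
later_fragment = "ATGGCGACGT"
differences = point_differences(earlier_fragment, later_fragment)
```

A single A to G difference between independently obtained fragments is consistent with either allelic variation or an amplification or sequencing error, so in practice the position would be confirmed on both strands.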

A CH14-2a16-1 cloned insert has been used to probe the level of relative expression in
polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4.
The relative CH14-2a16-1 expression observed at the mRNA level is shown in Table 11:

TABLE 11: Northern blot analysis

Tissue               CH14-2a16-1 mRNA
heart                +
brain                +
placenta             +
lung                 +
liver                +
skeletal muscle      +
kidney               +/-
pancreas             [not legible]
spleen               +
thymus               +
prostate             [not legible]
testis               ++++
ovary                +
small intestine      +
colon                [not legible]
peripheral blood     +/-

++++  Very high
+++   High
++    Medium
+     Low
+/-   Very low



CH14-2a16-1 mRNA was particularly high in testis. The level of expression in breast cancer
cell lines is also quite high, since the Northern analysis performed on these lines was conducted on
total cellular RNA. It is likely that the CH14-2a16-1 gene is involved in a biological process that is
typical to the tissue types showing medium to high levels of expression, which may relate to increased
tissue growth or metabolism.

Five motifs corresponding to a zinc finger protein have been found in the CH14-2a16-1
nucleotide sequence. Further zinc finger motifs may be present in CH14-2a16-1 in the upstream
direction. Zinc finger motifs are present, for example, in RNA polymerases I, II, and III from S.
cerevisiae, and are related to the zinc knuckle family of RNA/ssDNA-binding proteins found in the HIV
nucleocapsid protein. The actual sequence observed in each of the five zinc finger motifs of
CH14-2a16-1 is:



Cys-(Xaa)[?]-Cys-(Xaa)4-Cys-(Xaa)[?]-His (SEQ. ID NO:38), or
Cys-(Xaa)5-Cys-(Xaa)[?]-Cys-(Xaa)[?]-His (SEQ. ID NO:39)






which are indicated in Figure 20 by underlining. This is similar to 7 zinc finger motifs of NAB2
that make up an RNA/ssDNA binding region (Anderson et al.). Accordingly, the CH14-2a16-1 gene
product is suspected of having DNA or RNA binding activity, and may be specific for polyadenylated
RNA. [...] mRNA [...] protein. This role, in turn, may be closely implicated in cell growth or
proliferation, [...] as manifest in tumor cells.
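
Motifs of the general form Cys-(Xaa)n-Cys-(Xaa)n-Cys-(Xaa)n-His can be located in a translated sequence with a regular expression. The sketch below is illustrative only: the three spacings are parameters, the values used in the example (5, 4, 3) are assumptions rather than the spacings of SEQ. ID NO:38 or NO:39, and the example peptide is hypothetical.

```python
import re


def ccch_pattern(spacing1: int, spacing2: int, spacing3: int) -> re.Pattern:
    """Compile a pattern for Cys-(Xaa)a-Cys-(Xaa)b-Cys-(Xaa)c-His in one-letter code."""
    return re.compile(rf"C.{{{spacing1}}}C.{{{spacing2}}}C.{{{spacing3}}}H")


def find_motifs(protein: str, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Hypothetical peptide containing one motif with spacings 5, 4 and 3.
peptide = "AACGGGGGCLLLLCPPPHKK"
hits = find_motifs(peptide, ccch_pattern(5, 4, 3))
```

Scanning with several spacing combinations, one compiled pattern per combination, would be one way to count how many such motifs a predicted protein contains.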

Example 8: Identification of other cancer-associated genes

cDNA fragments corresponding to additional cancer-associated genes are obtained by
applying the techniques of Examples 1 & 2 with appropriate adaptations. As before, cancer cells
are selected for use in differential display of RNA, based on whether they share a duplicated
chromosomal region according to Table 12:



TABLE 12: Cancer cell lines sharing duplicated chromosomal regions

Chromosomal location (in printed order):
  1p22-32; 1p22; 1p32-33; 1q21-22; 1q24; 1q31; 1q32
  2p24-25; 2; 2q; 2q33-36; 3p22-24; 3q24-26; 3q25-26
Cancer type & references (in printed order):
  small cell (Levin 1994)
  bladder (Kallioniemi 1995)
  [...] (Steilen-Gimbel); breast (Ried 1995); small cell lung (Ried 1994)
  sarcoma (Forus 1995a & b); breast (Muleris 1994a)
  small cell (Levin 1994)
  bladder (Kallioniemi 1995)
  glioma (Muleris 1994b; Schrock)
  head and neck (Speicher 1995), breast (Muleris 1994a)
  small cell lung (Ried 1994)
  small cell lung (Levin 1994)
  head and neck (Speicher 1995)
  head and neck (Speicher 1995)
  head and neck (Speicher 1995)
  bladder (Voorter), small cell (Levin 1994)
  bladder (Kallioniemi 1995), glioma (Kim), osteosarcoma (Tarkkanen)
  ovarian (Iwabuchi)

Chromosomal location: 3q26-term; 3q
Cancer type & references:
  head and neck (Speicher 1995)
  small cell lung (Levin 1995; Ried 1994); head and neck (Speicher 1995)

Chromosomal location: 4q12
Cancer type & references:
  glioma (Schrock)

Chromosomal location: 5p15.1
Cancer type & references:
  small cell lung (Levin 1994 & 1995; Ried 1994)
  glioma (Muleris 1994b)

Chromosomal location: 6p; 6p21-term
Cancer type & references:
  osteosarcoma (Forus 1995a); breast (Ried 1995)
  melanoma (Speicher)

Chromosomal location: 7p; 7p11-12; 7q21-32; 7q21-22; 7q33-term; 7
Cancer type & references:
  glioma (Schlegel 1994 & 1996; may be EGFR)
  glioma (Muleris 1994b; Schrock), small cell lung (Ried 1994)
  glioma (Kim; Muleris 1994b; Schrock)
  head and neck (Speicher), glioma (Schrock)
  head and neck (Speicher 1995)
  colon (Schlegel 1995); glioma (Kim), head and neck (Speicher); prostate (Visakorpi)

Chromosomal location: 8q; 8q21; 8q24; 8q22-24; 8q24-25; 8q23-term; 8q24
Cancer type & references:
  small cell lung (Ried 1994)
  bladder (Kallioniemi 1995)
  myeloid leukemia (Mohamad)
  glioma (Kim; Muleris 1994b); breast (Muleris 1994a)
  small cell (Levin 1994; Ried 1994); breast (Muleris 1994a)
  sarcoma (Forus 1995a), melanoma (Speicher)
  ovarian (Iwabuchi)
  breast (Ried 1996; Isola; Muleris 1994a), small cell lung (Levin 1994 & 1995), B-cell
  leukemias (Bentz 1994a), myeloid leukemia (Bentz 1994b), glioma (Schlegel),
  head and neck (Speicher 1995), prostate (Cher, Visakorpi)

Chromosomal location: 9; 9p; 9p13
Cancer type & references:
  head and neck (Speicher)
  head and neck (Speicher)
  glioma (Muleris 1994b)
  breast (Muleris 1994a)

Chromosomal location: 10p; 10p13-14; 10q22
Cancer type & references:
  head and neck (Speicher 1995)
  bladder (Voorter)
  breast (Muleris 1994a)

Chromosomal location: 11q13
Cancer type & references:
  head and neck (Speicher 1995), breast (Muleris 1994a)

Chromosomal location (in printed order):
  12; 12p; 12q; 12q12-15; 12q21.3-22
  17; 17p11-12; 17q; 17q21.1; 17q22-23; 17q22-24
  18p11
  19q13.1
  20p; 20q; 20q13.3
  22q; 22q11-13
  X; Xq; Xq24; Xq11-13
Cancer type & references (in printed order):
  B-cell leukemias (Bentz 1995a)
  head and neck (Speicher 1995), glioma (Schrock)
  glioma (Schlegel 1994)
  bladder (Voorter), osteosarcoma (Tarkkanen), liposarcoma (Suijkerbuijk)
  liposarcoma (Suijkerbuijk)
  colon (Schlegel 1995)

  breast (Ried 1995), head and neck (Speicher 1995)
  bladder (Kallioniemi 1995)

  head and neck (Speicher 1995), small cell lung (Ried 1994)
  head and neck (Speicher 1995)
  breast (Muleris 1994a)
  head and neck (Speicher 1995)
  breast (Ried 1995)
  breast (Muleris 1994a)

  head and neck (Speicher 1995)
  osteosarcoma (Forus 1995a; Tarkkanen)
  breast (Ried 1995), small cell lung (Ried 1994)
  breast (Muleris 1994a)
  bladder (Voorter), breast (Muleris 1994a)
  breast (Kallioniemi 1994)

  bladder (Voorter)

  small cell lung (Ried 1994)

  head and neck (Speicher 1995)
  ovarian (Iwabuchi), colon (Schlegel 1995), breast (Isola; Tanner)
  breast (Muleris 1994a; Kallioniemi 1994)

  head and neck (Speicher 1995)
  bladder (Voorter), glioma (Schrock)

  prostate (Visakorpi)
  small cell lung (Levin 1995)
  small cell (Levin 1994)
  prostate (Visakorpi), osteosarcoma (Tarkkanen)



Control RNA is prepared from normal tissues to match that of the cancer cells in the
experiment. Normal tissue is obtained from autopsy, biopsy, or surgical resection. Absence of
neoplastic cells in the control tissue is confirmed, if necessary, by standard histological techniques.
cDNA corresponding to RNA that is overabundant in cancer cells and duplicated in a proportion of
the same cells is characterized further, as in Examples 3-7. Additional cDNA comprising an entire
protein-product encoding region is rescued or selected according to standard molecular biology
techniques as described elsewhere in this disclosure.
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SEQ ID NO: 10: 
llllllllll TAC 

SEQ ID NO: 11: 
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SEQ ID NO: 12: 
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SEQ 10 NO: 13: 
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AGCCAGCGAA 
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What is claimed as the invention is: 

1. An isolated polynucleotide comprising a linear sequence of at least 10 nucleotides identical to
a linear sequence contained in a polynucleotide selected from the group consisting of
CH8-2a13-1, CH13-2a12-1, CH14-2a16-1, and CH1-9a11-2.

2. An isolated polynucleotide comprising a linear sequence of at least 40 consecutive
nucleotides at least 90% identical to a linear sequence contained in a sequence selected
from the group consisting of SEQ. ID NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID
NO:23, SEQ. ID NO:26, SEQ. ID NO:29, SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID
NO:35, but not in any of SEQ. ID NOS: 1, 3, 5, and 7.

3. The isolated polynucleotide of claim 2, comprising a linear sequence of at least 100
consecutive nucleotides at least 90% identical to a sequence contained in the selected
sequence.

4. The isolated polynucleotide of claim 2, comprising a linear sequence of at least 40
consecutive nucleotides at least 95% identical to a sequence contained in the selected
sequence.

5. An isolated polynucleotide comprising a linear sequence of at least 40 consecutive
nucleotides that hybridizes with a DNA having a sequence selected from the group consisting
of SEQ. ID NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ.
ID NO:29, SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID NO:35, under conditions where it
does not hybridize with SEQ. ID NOS: 1, 3, 5, 7, or any other DNA from a human cell.

6. The isolated polynucleotide of claim 5, wherein the linear sequence is at least 100
consecutive nucleotides.

7. An isolated polynucleotide comprising a sequence of at least 40 consecutive nucleotides that
hybridizes with an RNA having a sequence selected from the group consisting of SEQ. ID
NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ. ID NO:29,
SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID NO:35, under conditions where it does not
hybridize with SEQ. ID NOS: 1, 3, 5, 7, or any other RNA from a human cell.




8. The isolated polynucleotide of claim 7, wherein the linear ■ sequence is at least 100 
consecutive nucleotides 

9. The isolated polynudeotide of any of claims 2-8. wherein said linear sequence is contained in 
3 a duplicated gene or overabundant RNA in cancerous cells. 

10. The isolated polynucleotide of any of claims 2-8, which is a CH13-2a12-1 polynucleotide and
is contained in an encoding region for a protein or RNA molecule that controls cell
proliferation.



11. The isolated polynucleotide of any of claims 2-8, which is a CH14-2a16-1 polynucleotide and
is contained in an encoding region for a protein with DNA or RNA binding activity.

12. The isolated polynucleotide of any of claims 2-8, present in a recombinant plasmid deposited
under ATCC Accession No. 98074.

13. The isolated polynucleotide of any of claims 2-8, present in a recombinant phage deposited
under ATCC Accession No. 97595.

14. The isolated polynucleotide of any of claims 2-8, present in the XBCBT474 cDNA library
deposited under ATCC Accession No. 97594.

15. An isolated polynucleotide comprising a linear sequence of polynucleotides essentially
identical to a sequence selected from the group consisting of SEQ. ID NO:15, SEQ. ID NO:18,
SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ. ID NO:29, SEQ. ID NO:31, SEQ.
ID NO:33, and SEQ. ID NO:35.



16. An isolated polypeptide comprising a linear sequence of at least 5 amino acid residues
identical to a sequence encoded by a polynucleotide selected from the group consisting of
CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1.

17. An isolated polypeptide comprising a linear sequence of at least 5 consecutive amino acids
identical to a linear sequence contained in a sequence selected from the group consisting of
SEQ. ID NO:17, SEQ. ID NO:20, SEQ. ID NO:22, SEQ. ID NO:24, SEQ. ID NO:28, SEQ. ID
NO:30, SEQ. ID NO:32, SEQ. ID NO:34, and SEQ. ID NO:37; but not in any of SEQ. ID
NOS: 2, 4, 6, and 8.

18. The isolated polypeptide of claim 17, comprising a linear sequence of at least 15 consecutive
amino acids at least 90% identical to a linear sequence contained in the selected sequence.

19. The isolated polypeptide of claim 17 or 18, wherein said linear sequence is encoded in a
duplicated gene or overabundant RNA in cancerous cells.

20. The isolated polypeptide of claim 17 or 18, which is overexpressed in cancerous cells.

21. The isolated polypeptide of claim 17 or 18, wherein the polynucleotide selected from said
group is a CH1-9a11-2 polynucleotide, and the polypeptide is a transmembrane [...]

22. An isolated polypeptide comprising a linear sequence of amino acids essentially identical to a
sequence selected from the group consisting of SEQ. ID NO:17, SEQ. ID NO:20, SEQ. ID
NO:22, SEQ. ID NO:24, SEQ. ID NO:28, SEQ. ID NO:30, SEQ. ID NO:32, SEQ. ID NO:34,
and SEQ. ID NO:37; but not in any of SEQ. ID NOS: 2, 4, 6, and 8.

23. [...]

24. A monoclonal or isolated polyclonal antibody specific for the polypeptide of claim 22.

25. A method of detecting gene duplication in cancerous cells, comprising the steps of:

a) reacting DNA contained in a clinical sample with a reagent comprising the
polynucleotide of claims 2-8, said clinical sample having been obtained from an
individual suspected of having cancerous cells; and

b) comparing the amount of any complexes formed between the reagent and the DNA in
the clinical sample with the amount of any complexes formed between the reagent and
DNA in a control sample.

26. [...] suspected of having cancerous cells; and

[...] RNA in a control sample.

27. A method of determining gene duplication or overabundance of RNA in cancerous cells,
comprising the steps of:

a) amplifying DNA or RNA in a clinical sample with a primer comprising the polynucleotide
of claims 2-8 to yield an amplified polynucleotide, said clinical sample having been
obtained from an individual suspected of having cancerous cells; and

b) comparing the amount of polynucleotide amplified from the DNA or RNA with the
amount of polynucleotide amplified from DNA or RNA from a control sample.
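
Step b) of the detection methods above compares an amount measured in the clinical sample with the amount measured in a control sample. One simple way to express such a comparison is a ratio, sketched below in Python. The function names, the optional normalization to a reference signal, and the 2-fold cutoff are all assumptions for illustration; the claims specify neither a normalization nor a threshold.

```python
def fold_change(sample_signal: float, control_signal: float,
                sample_reference: float = 1.0,
                control_reference: float = 1.0) -> float:
    """Ratio of sample to control signal, each optionally normalized to a
    reference measurement (hypothetical, e.g. a loading control)."""
    if min(control_signal, sample_reference, control_reference) <= 0:
        raise ValueError("control and reference signals must be positive")
    return (sample_signal / sample_reference) / (control_signal / control_reference)


def is_elevated(fold: float, threshold: float = 2.0) -> bool:
    """Flag a fold change at or above an assumed cutoff (2-fold here)."""
    return fold >= threshold


# Hypothetical signal intensities in arbitrary units.
ratio = fold_change(sample_signal=8.0, control_signal=2.0)
print(ratio, is_elevated(ratio))
```

The cutoff is left as a parameter because the appropriate value would depend on the assay and is not stated in the claims.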

28. A method of screening for cancer associated with a gene duplication in an individual,
comprising the steps of:

a) determining gene duplication in cells from the individual according to the method of claim
25; and

b) correlating any gene duplication determined in step a) with an increased risk for the
cancer.

29. A method of screening for cancer associated with overexpression of RNA in an individual,
comprising the steps of:

a) determining overexpression of RNA in cells from the individual according to the method
of claim 26; and

b) correlating any RNA overexpression determined in step a) with an increased risk for the
cancer.

30. A method of screening for cancer associated with a gene duplication or overexpression of
RNA in an individual, comprising the steps of:

a) determining gene duplication or overexpression of RNA in cells from the individual
according to the method of claim 27; and

b) correlating any gene duplication or overexpression of RNA determined in step a) with an
increased risk for the cancer.

31. The method of any of claims 28-30, which is a screening method for breast cancer.

32. A diagnostic kit for detecting gene duplication or RNA overabundance in cells contained in an
individual as manifest in a clinical sample, comprising a reagent and a buffer in suitable
packaging, wherein the reagent comprises the polynucleotide of any of claims 2-8.

33. A method for detecting altered protein expression in cancerous cells, comprising the steps of:

a) reacting a polypeptide contained in a clinical sample with a reagent comprising the
antibody of claim 24, said clinical sample having been obtained from an individual
suspected of having cancerous cells; and

b) comparing the amount of any complexes formed between the reagent and the
polypeptide in the clinical sample with the amount of any complexes formed between the
reagent and a polypeptide in a control sample.

34. A diagnostic kit for detecting a polypeptide present in a clinical sample, comprising a reagent
and a buffer in suitable packaging, wherein the reagent comprises the antibody of claim 24.

35. A host cell genetically altered by the polynucleotide of any of claims 2 to 8 or claim 23.

36. A method of screening a pharmaceutical candidate, comprising the steps of:

a) separating progeny of the cell of claim 35 into a first group and a second group;

b) treating the first group of cells with the pharmaceutical candidate;

c) not treating the second group of cells with the pharmaceutical candidate; and

d) comparing the phenotype of the treated cells with that of the untreated cells.

37. [...] comprising the polynucleotide of [...]

38. [...] pharmaceutical preparation of claim 37.

39. [...] comprising the antibody of claim 24, [...] preparation being capable of reducing the
pathology of cancerous cells.

40. A method for treating an individual bearing cancerous cells, comprising administering the
pharmaceutical preparation of claim 39.

41. A pharmaceutical preparation comprising the polypeptide of claim 17 or 18 in an
immunogenic form, and a pharmaceutically compatible excipient.

42. A method for treatment of cancer, comprising administration of the pharmaceutical
preparation of claim 41.

43. A method for obtaining cDNA corresponding to a gene that is duplicated or overexpressed
in cancer, comprising the steps of:

a) supplying an RNA preparation from control cells;

b) supplying RNA preparations from at least two different cancer cells;

c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that
different cDNA corresponding to different RNA in each preparation are displayed
separately;

d) selecting cDNA corresponding to RNA that is present in greater abundance in the
cancer cells of step b) relative to the control cells of step a);

e) supplying a digested DNA preparation from control cells;

f) supplying digested DNA preparations from at least two different cancer cells;

g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step
f); and

h) further selecting cDNA from the cDNA of step d) corresponding to a gene that is
duplicated in the cancer cells of step f) relative to the control cells of step e).
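
The two selection stages of claim 43 (steps d and h) can be read as successive filters over the displayed cDNAs. The Python sketch below is a schematic reading only: the dictionary layout, the function names, the numeric signal values, and the rule that a cDNA must exceed the control in at least two cancer preparations are assumptions for illustration, not the procedure of the specification.

```python
from typing import Dict, Iterable, List


def select_elevated(control: Dict[str, float],
                    cancer_preps: List[Dict[str, float]],
                    candidates: Iterable[str],
                    min_preps: int = 2) -> List[str]:
    """Keep candidates whose signal exceeds the control signal in at least
    `min_preps` cancer preparations (assumed reading of 'greater abundance')."""
    selected = []
    for name in candidates:
        baseline = control.get(name, 0.0)
        hits = sum(1 for prep in cancer_preps if prep.get(name, 0.0) > baseline)
        if hits >= min_preps:
            selected.append(name)
    return selected


def select_duplicated_and_overexpressed(rna_control, rna_cancer,
                                        dna_control, dna_cancer) -> List[str]:
    # Step d): cDNA whose RNA is more abundant in the cancer cells.
    step_d = select_elevated(rna_control, rna_cancer, sorted(rna_control))
    # Step h): of those, cDNA whose gene gives a stronger signal on cancer DNA.
    return select_elevated(dna_control, dna_cancer, step_d)


# Hypothetical signals for three displayed cDNAs and two cancer cell lines.
rna_control = {"cDNA_A": 1.0, "cDNA_B": 1.0, "cDNA_C": 1.0}
rna_cancer = [{"cDNA_A": 5.0, "cDNA_B": 4.0, "cDNA_C": 1.0},
              {"cDNA_A": 6.0, "cDNA_B": 3.0, "cDNA_C": 0.8}]
dna_control = {"cDNA_A": 1.0, "cDNA_B": 1.0, "cDNA_C": 1.0}
dna_cancer = [{"cDNA_A": 4.0, "cDNA_B": 1.0, "cDNA_C": 1.0},
              {"cDNA_A": 3.0, "cDNA_B": 1.0, "cDNA_C": 1.0}]

print(select_duplicated_and_overexpressed(rna_control, rna_cancer,
                                          dna_control, dna_cancer))
```

In this made-up data set, `cDNA_B` passes the RNA filter but not the DNA filter, which is the kind of gene (overexpressed without duplication) that the later RNA-based steps of claim 49 are directed at.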

44. The method of claim 43, wherein the two different cancer cells used to supply RNA in step
b) share a duplicated gene in the same region of a chromosome.

45. The method of claim 43, wherein RNA preparations from at least three different cancer
cells are supplied in step b).



46. The method of claim 43, wherein the three different cancer cells used to supply RNA in
step b) share a duplicated gene in the same region of a chromosome.



47. The method of claim 43, wherein the control cells of step a) are uncultured.



48. The method of claim 43, further comprising supplying a digested mitochondrial DNA
preparation; hybridizing the cDNA of step h) with the digested mitochondrial DNA
preparation; and further selecting cDNA from the cDNA of step h) corresponding to genes
that do not hybridize with the digested mitochondrial DNA preparation.

49. The method of claim 43, further comprising the steps of:

i) supplying an RNA preparation from control cells;

j) supplying RNA preparations from at least two different cancer cells;

k) hybridizing the cDNA of step h) with the RNA preparations of step i) and step j); and

l) further selecting cDNA from the cDNA of step h) corresponding to RNA that is present in
greater abundance in the cancer cells of step j) relative to the control cells of step i).

50. The method of claim 49, wherein the gene to which the cDNA corresponds is not
duplicated in at least one of the cancer cells used to supply the RNA in step j) relative to
the control cells of step e).



51. The method of claim 43, wherein the two different cancer cells used to supply the RNA
preparations in step b) are breast cancer cells.

52. The method of claim 43, wherein the two different cancer cells used to supply the RNA
preparations in step b) are from a common type of cancer, wherein the type of cancer is
selected from the group consisting of lung cancer, glioblastoma, pancreatic cancer, colon
cancer, prostate cancer, hepatoma, and myeloma.

53. The method of claim 43, wherein the two different cancer cells used to supply the digested
DNA preparations in step f) are breast cancer cells.

54. The method of claim 43, wherein the two different cancer cells used to supply the digested
DNA preparations in step f) are from a common type of cancer, wherein the type of cancer is
selected from the group consisting of lung cancer, glioblastoma, pancreatic cancer, colon
cancer, prostate cancer, hepatoma, and myeloma.

55. A method for obtaining cDNA corresponding to a gene that is deleted or underexpressed in
cancer, comprising the steps of:

a) supplying an RNA preparation from control cells;

b) supplying RNA preparations from at least two different cancer cells that share a deleted
gene in the same region of a chromosome;

c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that
different cDNA corresponding to different RNA in each preparation are displayed
separately; and

d) selecting cDNA corresponding to RNA that is present in lower abundance in the cancer
cells of step b) relative to the control cells of step a).

56. The method of claim 55, further comprising the steps of:

e) supplying a digested DNA preparation from control cells;

f) supplying digested DNA preparations from at least two different cancer cells;

g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step
f); and

h) further selecting cDNA from the cDNA of step d) corresponding to a gene that is deleted
in the cancer cells of step f) relative to the control cells of step e).

57. A method for characterizing a gene that is duplicated or has altered expression in cancer,
comprising obtaining cDNA corresponding to the gene according to the method of any of
claims 43-56, and then sequencing the cDNA.

58. A method of screening a candidate drug for cancer treatment, comprising obtaining cDNA
corresponding to a gene that is duplicated or has altered expression in cancer according to
the method of any of claims 43-56, and comparing the effect of the candidate drug on a
cell genetically altered with the cDNA with the effect on a cell not genetically altered with
the cDNA.
1: 415 ER
2: 640 ERNP
3: 640 ERP
4: BT 474
5: CAMA 1
6: SKBR 3
7: 600 MPE
8: MDA-MB-361
9: MDA-MB-231
10: MDA-MB-134
11: MDA-MB-468
12: BT20
13: T47D
14: MDA-MB-157
15: UACC812
16: MDA-MB-435
17: MCF7
18: MDA-MB-453
19: DU4475

Figure 4
2: BT 474
3: MCF 7
4: MDA-MB-157
5: SKBR 3
11: MDA-MB-231
14: MDA-MB-468
1: 415 ER
2: 640 ERNP
3: 640 ERP
4: BT 474
5: CAMA 1
6: SKBR 3
7: 600 MPE
8: MDA-MB-361
9: MDA-MB-231
10: MDA-MB-134
11: MDA-MB-468
12: BT 20
13: T47D
14: MDA-MB-157
15: UACC812
16: MDA-MB-435
17: MCF 7
18: MDA-MB-453
19: DU4475
+ strand (sense)        1st base   sequence (5'-->3')

1. pch1-t7-1f           1123       CGG GAG GTT T[...]
2. pch1-t7-2f           1437       GCG CTG [...] CAA AAT TG
3. pch1-t7-3f           1729       TCT AAA [...] GG
4. pch1-t7-4f           1987       CAG AAA TTA T[...]
5. pch1-t7-5f           2266       CAG GAA GA[...]
6. pch1-sp6-3fb         2684       (T) AAA CAT ACA [...]
7. pch1-sp6-2rb         2966       TTG GCA GCG A[...]
8. pch1-sp6-1rb         3283       CCT GAT TTT ATA [...] GCC CC

- strand (antisense)

9. pch1-sp6-1f          3302       [...] ATA AAA TCA GG
10. pch1-sp6-2f         2987       ATT CAA ATA CAG TTG CTG C
11. pch1-sp6-3f         2705       TTA GTG TTT ATT GTG TAT G
12. pch1-sp6-4f         2458       AGT GTT CAT TTC CAG TGA G
13. pch1-sp6-5f         2066       CTT TGT TCT TGG ACT TTA G
14. pch1-t7-3fb         1748       CCT TGG TCT TGG ACT TTA G
15. pch1-t7-2rb         1445       AAT TTT GTA CTT GCA GCG C
16. pch1-t7-1rb         1141       GTC GAT CTG AAA CCT CCC G
17. CH1a                1063       GTG CCT GTA GCA ACT GGA TGG C
18. CH1b                1079       GTC ATG TTG GTC AGC TCT GCC




Figure 8

[Nucleotide sequence listing; the sequence text is not legible in the source scan.]

[Amino acid sequence listing; the sequence text is not legible in the source scan.]




Figure 10 



+ strand (sense)        1st base   sequence (5'-->3')

1. pch8-sp6-1f          369        GCT AAG CCA GAG CTA GAG G
2. pch8-sp6-2f          677        tCT GAT CTT CTG CTG ATT C
3. pch8-1fa             1238       (CTC) TCT GAA CTG CCT GAG AGA C
4. pch8-2f              1462       [...] AGC ATT ACA AG
5. pch8-3f              1745       TCA TCA AAT GAT CAG AAC C
6. [...]                [...]      [...]GAG AGT TGG TAT CC
7. pch8-5f              2277       GGA ATA AGG AAA GAG CTT G
8. [...]                [...]      ACT CAT AT[?] CCA ATA CC
9. pch8-5rb             2849       cCT GAG AGA CAG AAC TGT TC
10. [...]               [...]      GGA CCC TTC ACT TCC TTA C
11. pch8-3rb            3370       GGC CAC CAC TTG TCC TCG G
12. [...]               3970       [?]TA CTG CCT CTC TTA AAT G

- strand (antisense)    1st base   sequence (5'-->3')

[?]. [...]              [...]      CAG TTA CAG CAC TGT TCT G
[?]. pch8-3r            3360       CCC AGG ACA AGT GGT GGC C
[?]. [...]              [...]      [...]G AAG TGA AGG GTC C
17. pch8-5r             3849       GAA CAG TTC TGT CTC TCA GG
[?]. [...]              [...]      CAA GCT CTT TCC TTA TTC C
[?]. [...]              [...]      [...]
[?]. [...]              1746       TGG TTC TGA TCA TTT GAT G
23. [...]               [...]      CTT GTA A[...] CTC CCA TTT GG
[?]. [...]              1238       GTC TCT CAG GCA GTT CAG A
[?]. [...]              [...]      [...] CAC GTA CAG C
[?]. [...]              612        CAA TGA CCA GTA GCA TAA C
[?]. [...]              [...]      CAG CAT TTA AGA GAG GCA G
[?]. [...]              [...]      CCT GTA GCT CTG GCT TAG CAT CC
28. CH8b                510        CCC CTT CAT T[...]A GAT CAT CTA G
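
Several entries in the Figure 10 primer list appear to be reverse complements of one another, for example pch8-5f (GGA ATA AGG AAA GAG CTT G) and the antisense entry CAA GCT CTT TCC TTA TTC C. The short Python sketch below shows how such a pairing can be checked. The helper name is an assumption for illustration, and the primer strings are copied from the list above as they appear in the scan.

```python
# Translation table for Watson-Crick complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring whitespace."""
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


sense = "GGA ATA AGG AAA GAG CTT G"        # pch8-5f as listed
antisense = "CAA GCT CTT TCC TTA TTC C"    # antisense entry as listed

# Compare after removing the spaces used for codon-style grouping.
print(reverse_complement(sense) == "".join(antisense.split()))
```

The same check applied to pch8-5r (GAA CAG TTC TGT CTC TCA GG) gives CCTGAGAGACAGAACTGTTC, which agrees with the pch8-5rb entry apart from letter case.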
[Nucleotide sequence listing; the sequence text is not legible in the source scan.]
APWRGPADRF FWGGANLSAH LVSSNNICJTP ALRPVNHPQC PGTE-SVRLT
MLDFLAENNL CGQAILRIVS CX3NA1IAELL RLSEFIPAVF RLKDRADQQK 
YGDIIFDFSY FKGPELWESK LDAKPELQDL DEEFREHNIE IVTRFVIAFQ 
SVHKYIVDLN RYLDDLNEGV YIQQTLETVIj UJEDGKQLLC EALYLYGVML 
LVIDQKIEGE VRERMLVSYY RYSAARSSAD SNMDDICKLL RSTGYSSQPG 
AKRPSNYPES YFQRVPINES FISMVIGRLR SDDIYNQVSA YPLP£HRSTA 
LANQAAHLYV ILYFEPSILH TOQAKMREIV DKYFPDNWVI SIYM3ITVNL 
VDAWEPVKAA KTALNOTLDL SNVREQASRY ATVSERVHAQ VQQFLKBGYL 
REEMVLENIP KL12KXRDCN VAIRWIlflin' ADSACDPNNK RLRQIKDQIL 
TDSRYNPRIL POLLLDTAQF EFILKEMFRQ MLSEKQTKWE HYKKHSSERM 
TELADVFSGV KPLTRVEKNE NLQAWFREIS KQILSLNYDD STAAGRKTVQ 
LIQALEEVQE FHQLESNLQV CQFLAOTRKF LH^IIRTINI KEEVLITMQI 
VGDLSFAWQL IDSFTSmQE SIRVNPSMVT KLWITFUOA SALDLPLLRI 
NQANRPDLLS VSQYYSGELV SYVRKVLQII PESMFTSLLK IIKLOTHDII 
EVPTRLDKDK LRDYAQLGPR YEVAKLTHAI STFTEGlim KTTLVGIIKV 
DPKQLLEDGI RKELVKRVAF ALHRGLIHJP RAKPSELMPK LKELGATODG 
FHRSFEYIQD YVNIYGLKIW QEEVSRIINY NVEQBCNNFL RTKIQDWQSM 
YQSTHIPIPK PTPVDESVTF IGRLCPEILR ITDPKMTCHI DQLNTWDMK 
THQEVTSSRL FSEIQTTLGT PGLNGLDRLL CFMIVKELQN FLSMFQKIIL 
RDRTVQDTLK TLMNAVSPUC SIVANSNKIY FSAIAKTQKI WTAYLEAIMK 
VGQMQILRQQ lANELNYSCR FDSKHLAAAL ENLNKALIAD lEAHYQDPSL 
PYPKEDNTLL YEITAYLEAA GIHNPINKIY ITTKRLPYFP IVNFLFLIAQ 
LPKLQYNKNL GMVCRKPTDP VEWPPLV1X3L LTLUCQFHSR YTEQLLALIG 
QFICSTVEQC TSQKIPEIPA EWGALLFLE DYVRYTKLPR RVAEAHVPNF 
IFDEFRTVL* LFFLLLQWKD CP* IFPPSC3M NLKMKRNSVA HTTAFFLSIM 
GNIRRYE*DI SHGIS^YN-Y CLNHGITCNL YQIKAEHIFV LPLI2iAECNC 
YV^IHLVLCS KELFVQLQIF SKIVLL 
Figure 12(B)

[Amino acid sequence listing; the sequence text is not legible in the source scan.]






Figure 13(A) 



AGG GGC 
AGA AGA 
CCG CGC 
GAC AGG 
TCT AAT 
TGT CCA 
GCC GAG 
GGT AAT
GCT GTG 
ATC ATA 
AAA CTG 
GAA AAC 
GTA CAT 
GAA GGG 
GAT GGA 
CTA CTG 
CTG GTT 
AAT ATG 
CAA CCA 
AGA GTG 
AGA TCT 
CAT CGC 
CTC TAC 
GAG ATA 
ATG GGG 
GCA AAA 
CAG GCA 
CAG CAA 
AAT ATC 
CGA TGG 
AAA CGC 
AAT CCC 
TTT ATA 
AAA TGG 
GCT GAT 
GAA AAC 
TTA AAT 
ATA CAA 
CTG CAA 
ATG ATC 
ATC GTT 
TCC ATC 
CTC AGA 
CTT CGT 
TAC TAT 
ATC CCA 
ACC CAC 
AGG GAC
CAT GCT



GGA AGT CGG 
GGA CCG CCG 
GCG ACC TCC 
TTC TTT AAT 
AAT ATA CAG 
GGC ACA GAG 
AAC AAC CTC 
GCC ATC ATT 
TTC AGG TTA 
TTT GAT TTC 
GAT GCT AAG 
AAC ATA GAA 
AAA TAT ATT 
GTT TAT ATT 
AAA CAA CTT 
GTC ATT GAC 
TCT TAC TAC 
GAC GAT ATT 
GGT GCC AAA 
CCT ATC AAC 
GAT GAT ATT 
AGC ACA GCC 
TTT GAG CCT 
GTG GAT AAA 
ATC ACA GTT 
ACT GCT TTA 
AGC AGA TAT 
TTT CTA AAA 
CCA AAG CTT 
CTG ATG CTT 
CTT CGT CAA 
AGG ATC CTC 
CTC AAA GAG 
GAG CAT TAC 
GTC TTT TCA 
CTT CAA GCT 
TAT GAT GAT 
GCT TTG GAA
GTA TGT CAG 
AGA ACC ATT 
GGG GAC CTT 
ATG CAA GAA 
GCT ACC TTC 
ATT AAT CAG 
TCT GGA GAG 
GAA AGC ATG 
GAC ATT ATT 
TAT GCT CAG
ATT TCC ATT 



[Remaining codon columns of Figure 13(A) are not legible in the source scan.]
Figure 13(B)

TTG GTT GGC ATC ATC AAG GTG GAT CCA AAG CAG TTG CTG GAA GAT GGA
ATA AGG AAA GAG CTT GTG AAG CGC GTT GCC TTT GCC CTG CAT AGG GGA
CTG ATA TTC AAC CCT CGA GCC AAG CCA AGT GAA TTG ATG CCC AAG CTG
AAA GAG TTG GGA GCG ACC ATG GAT GGA TTC CAT CGT TCT TTT GAA TAC
ATA CAG GAC TAT GTC AAC ATT TAT GGT CTG AAG ATT TGG CAG GAA GAA
GTA TCT CGT ATC ATA AAT TAC AAC GTG GAG CAA GAG TGT AAT AAC TTT
CTA AGA ACG AAG ATT CAA GAT TGG CAA AGC ATG TAC CAG TCC ACT CAT
ATT CCA ATA CCC AAG TTT ACC CCT GTG GAT GAG TCT GTA ACG TTT ATT
GGT CGA CTC TGC AGA GAA ATC CTG CGG ATC ACA GAC CCA AAA ATG ACA
TGT CAC ATA GAC CAG CTG AAC ACT TGG TAT GAT ATG AAA ACT CAT CAG
GAA GTG ACC AGC AGC CGC CTC TTC TCA GAA ATC CAG ACC ACC TTG GGA
ACC TTT GGT CTA AAT GGC TTA GAC AGG CTT CTG TGC TTT ATG ATT GTA
AAA GAG TTA CAG AAT TTC CTC AGT ATG TTT CAG AAA ATT ATC CTG AGA
GAC AGA ACT GTT CAG GAC ACT TTA AAA ACC CTC ATG AAT GCT GTC AGT
CCC CTA AAA AGT ATT GTC GCA AAT TCA AAT AAA ATT TAT TTT TCC GCC
ATT GCC AAA ACA CAG AAG ATT TGG ACT GCG TAT CTC GAG GCT ATA ATG
AAG GTT GGG CAG ATG CAG ATT CTG AGG CAA CAG ATT GCC AAT GAA TTA
AAT TAT TCT TGT CGG TTT GAT TCT AAA CAT CTG GCA GCT GCT CTG GAG
AAT CTC AAT AAG GCT CTC CTA GCA GAC ATT GAA GCC CAC TAT CAG GAC
CCT TCA CTT CCT TAC CCC AAA GAA GAT AAC ACA CTT TTA TAT GAA ATC
ACA GCC TAT CTG GAG GCA GCT GGC ATT CAC AAC CCA CTG AAT AAG ATA
TAC ATA ACA ACA AAG CGC TTA CCC TAT TTT CCA ATT GTA AAC TTT CTA
TTT TTG ATC GCT CAG TTG CCA AAA CTT CAA TAC AAC AAA AAT CTG GGA
ATG GTC TGC CGA AAA CCG ACC GAC CCG GTT GAT TGG CCA CCA CTT GTC
CTG GGA CTG CTC ACT CTG CTG AAG CAG TTC CAT TCC CGG TAC ACC GAG
CAG CTC CTG GCG CTG ATT GGC CAG TTT ATC TGC TCC ACG GTG GAG CAG
TGT ACA AGC CAG AAG ATA CCT GAA ATT CCT GCA GAT GTT GTG GGT GCC
CTT CTG TTC CTG GAG GAT TAT GTT CGG TAC ACA AAG CTA CCC AGG AGG
GTT GCT GAA GCA CAT GTG CCT AAT TTC ATT TTT GAT GAG TTC AGA ACA
GTG CTG TAA CTG TTT TTC CTA CTT CTT CAA TGG AAG GAT TGT CCT TAG
ATC TTC CCA CCA TCA CAA ATG AAT TTG AAG ATG AAA AGA AAC TCA GTT
GCT CAT ACA ACT GCA TTT TTT CTG TCT ATT ATG GGA AAC ATC AGA CGT
TAT GAG TAA GAT ATA TCT CAT GGC ATT AGT TAA TAT AAC TGA TAT TGT
TTA AAT CAT GGT ATT ACA TGC AAT TTA TAT CAG ATA AAA GCA GAA CAC
ATT TTT GTA CTG CCT CTC TTA AAT GCT GAA TGT AAC TGT TAT GTA TAA
ATC CAT TTA GTT TTA TGT TCT AAA GAA CTA TTT GTG CAA CTC CAG ATT
TTC AGT AAA ATA GTA TTA CTA GT
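
Figures 13 and 14 present the same region as codons and as three-letter amino acids, so each row of one can be checked against the other with the standard genetic code. The Python sketch below translates the sixteen codons that make up the first row of Figure 13(B). It assumes the standard code (with `*` for stop codons) and reads the OCR-damaged second codon as GTT; the function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(codons: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids.
    Unrecognized codons become 'X'; a trailing partial codon is dropped."""
    seq = "".join(codons.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


first_row = "TTG GTT GGC ATC ATC AAG GTG GAT CCA AAG CAG TTG CTG GAA GAT GGA"
print(translate(first_row))  # LVGIIKVDPKQLLEDG
```

The result corresponds to Leu Val Gly Ile Ile Lys Val Asp Pro Lys Gln Leu Leu Glu Asp Gly, which is how the same stretch reads in the three-letter listing of Figure 14.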






Figure 14(A)

Arg Gly Gly Ser Arg Gly Leu Thr Arg Ser Arg Ser Gly Thr Ala Asp
Arg Arg Gly Pro Pro Pro * Gly Arg Gly Gly Asn Trp Val Pro Ala
Pro Arg Ala Thr Ser Gly Pro Ala Arg Ala Pro Trp Arg Gly Pro Ala
Asp Arg Phe Phe Asn Gly Gly Ala Asn Leu Ser Ala His Leu Val Ser
Ser Asn Asn Ile Gln Thr Pro Ala Leu Arg Pro Val Asn His Pro Gln
Cys Pro Gly Thr Glu * Ser Val Arg Leu Thr Met Leu Asp Phe Leu
Ala Glu Asn Asn Leu Cys Gly Gln Ala Ile Leu Arg Ile Val Ser Cys
Gly Asn Ala Ile Ile Ala Glu Leu Leu Arg Leu Ser Glu Phe Ile Pro
Ala Val Phe Arg Leu Lys Asp Arg Ala Asp Gln Gln Lys Tyr Gly Asp
Ile Ile Phe Asp Phe Ser Tyr Phe Lys Gly Pro Glu Leu Trp Glu Ser
Lys Leu Asp Ala Lys Pro Glu Leu Gln Asp Leu Asp Glu Glu Phe Arg
Glu Asn Asn Ile Glu Ile Val Thr Arg Phe Tyr Leu Ala Phe Gln Ser
Val His Lys Tyr Ile Val Asp Leu Asn Arg Tyr Leu Asp Asp Leu Asn
Glu Gly Val Tyr Ile Gln Gln Thr Leu Glu Thr Val Leu Leu Asn Glu
Asp Gly Lys Gln Leu Leu Cys Glu Ala Leu Tyr Leu Tyr Gly Val Met
Leu Leu Val Ile Asp Gln Lys Ile Glu Gly Glu Val Arg Glu Arg Met
Leu Val Ser Tyr Tyr Arg Tyr Ser Ala Ala Arg Ser Ser Ala Asp Ser
Asn Met Asp Asp Ile Cys Lys Leu Leu Arg Ser Thr Gly Tyr Ser Ser
Gln Pro Gly Ala Lys Arg Pro Ser Asn Tyr Pro Glu Ser Tyr Phe Gln
Arg Val Pro Ile Asn Glu Ser Phe Ile Ser Met Val Ile Gly Arg Leu
Arg Ser Asp Asp Ile Tyr Asn Gln Val Ser Ala Tyr Pro Leu Pro Glu
His Arg Ser Thr Ala Leu Ala Asn Gln Ala Ala Met Leu Tyr Val Ile
Leu Tyr Phe Glu Pro Ser Ile Leu His Thr His Gln Ala Lys Met Arg
Glu Ile Val Asp Lys Tyr Phe Pro Asp Asn Trp Val Ile Ser Ile Tyr
Met Gly Ile Thr Val Asn Leu Val Asp Ala Trp Glu Pro Tyr Lys Ala
Ala Lys Thr Ala Leu Asn Asn Thr Leu Asp Leu Ser Asn Val Arg Glu
Gln Ala Ser Arg Tyr Ala Thr Val Ser Glu Arg Val His Ala Gln Val
Gln Gln Phe Leu Lys Glu Gly Tyr Leu Arg Glu Glu Met Val Leu Asp
Asn Ile Pro Lys Leu Leu Asn Cys Leu Arg Asp Cys Asn Val Ala Ile
Arg Trp Leu Met Leu His Thr Ala Asp Ser Ala Cys Asp Pro Asn Asn
Lys Arg Leu Arg Gln Ile Lys Asp Gln Ile Leu Thr Asp Ser Arg Tyr
Asn Pro Arg Ile Leu Phe Gln Leu Leu Leu Asp Thr Ala Gln Phe Glu
Phe Ile Leu Lys Glu Met Phe Lys Gln Met Leu Ser Glu Lys Gln Thr
Lys Trp Glu His Tyr Lys Lys Glu Gly Ser Glu Arg Met Thr Glu Leu
Ala Asp Val Phe Ser Gly Val Lys Pro Leu Thr Arg Val Glu Lys Asn
Glu Asn Leu Gln Ala Trp Phe Arg Glu Ile Ser Lys Gln Ile Leu Ser
Leu Asn Tyr Asp Asp Ser Thr Ala Ala Gly Arg Lys Thr Val Gln Leu
Ile Gln Ala Leu Glu Glu Val Gln Glu Phe His Gln Leu Glu Ser Asn
Leu Gln Val Cys Gln Phe Leu Ala Asp Thr Arg Lys Phe Leu His Gln
Met Ile Arg Thr Ile Asn Ile Lys Glu Glu Val Leu Ile Thr Met Gln
Ile Val Gly Asp Leu Ser Phe Ala Trp Gln Leu Ile Asp Ser Phe Thr
Ser Ile Met Gln Glu Ser Ile Arg Val Asn Pro Ser Met Val Thr Lys
Leu Arg Ala Thr Phe Leu Lys Leu Ala Ser Ala Leu Asp Leu Pro Leu
Leu Arg Ile Asn Gln Ala Asn Arg Pro Asp Leu Leu Ser Val Ser Gln
Tyr Tyr Ser Gly Glu Leu Val Ser Tyr Val Arg Lys Val Leu Gln Ile
Ile Pro Glu Ser Met Phe Thr Ser Leu Leu Lys Ile Ile Lys Leu Gln
Thr His Asp Ile Ile Glu Val Pro Thr Arg Leu Asp Lys Asp Lys Leu
Arg Asp Tyr Ala Gln Leu Gly Pro Arg Tyr Glu Val Ala Lys Leu Thr
His Ala Ile Ser Ile Phe Thr Glu Gly Ile Leu Met Met Lys Thr Thr
Leu Val Gly Ile Ile Lys Val Asp Pro Lys Gln Leu Leu Glu Asp Gly
Ile Arg Lys Glu Leu Val Lys Arg Val Ala Phe Ala Leu His Arg Gly
Figure 14(R)



Leu Ile
Lys Glu
Ile Gln
Val Ser
Leu Arg
Ile Pro
Gly Arg
Cys His
Glu Val
Thr Phe
Lys Glu
Asp Arg
Pro Leu
Ile Ala
Lys Val
Asn Tyr
Asn Leu
Pro Ser
Thr Ala
Tyr Ile
Phe Leu
Met Val
Leu Gly
Gln Leu
Cys Thr
Leu Leu
Val Ala
Val Leu
Ile Phe
Ala His
Tyr Glu
Leu Asn
Ile Phe
Ile His
Phe Ser



Phe Asn Pro 
Leu Gly Ala 
Asp Tyr Val 
Arg Ile Ile
Thr Lys Ile
Ile Pro Lys
Leu Cys Arg
Ile Asp Gln
Thr Ser Ser
Gly Leu Asn
Leu Gln Asn
Thr Val Gln
Lys Ser Ile
Lys Thr Gln
Gly Gln Met
Ser Cys Arg
Asn Lys Ala
Leu Pro Tyr
Tyr Leu Glu
Thr Thr Lys
Ile Ala Gln
Cys Arg Lys
Leu Leu Thr
Leu Ala Leu
Ser Gln Lys
Phe Leu Glu 
Glu Ala His 

* Leu Phe 
Pro Pro Ser 
Thr Thr Ala 

* Asp Ile
His Gly Ile
Val Leu Pro
Leu Val Leu
Lys Ile Val



Arg Ala Lys 
Thr Met Asp 
Asn Ile Tyr
Asn Tyr Asn
Gln Asp Trp
Phe Thr Pro
Glu Ile Leu
Leu Asn Thr
Arg Leu Phe
Gly Leu Asp
Phe Leu Ser
Asp Thr Leu
Val Ala Asn
Lys Ile Trp
Gln Ile Leu
Phe Asp Ser
Leu Leu Ala
Pro Lys Glu
Ala Ala Gly
Arg Leu Pro
Leu Pro Lys
Pro Thr Asp
Leu Leu Lys
Ile Gly Gln
Ile Pro Glu
Asp Tyr Val
Val Pro Asn
Phe Leu Leu
Gln Met Asn
Phe Phe Leu 
Ser His Gly 
Thr Cys Asn 
Leu Leu Asn 
Cys Ser Lys 
Leu Leu 



Pro Ser 
Gly Phe 



Gly Leu 
Val Glu 
Gln Ser
Val Asp
Arg Ile
Trp Tyr
Ser Glu
Arg Leu
Met Phe
Lys Thr
Ser Asn
Thr Ala
Arg Gln
Lys His
Asp Ile
Asp Asn
Ile His
Tyr Phe
Leu Gln
Pro Val
Gln Phe
Phe Ile
Ile Pro
Arg Tyr
Phe Ile
Leu Gln
Leu Lys
Ser Ile
Ile Ser
Leu Tyr 
Ala Glu 
Glu Leu 



Glu Leu 
His Arg 
Lys Ile
Gln Glu
Met Tyr
Glu Ser
Thr Asp
Asp Met
Ile Gln
Leu Cys
Gln Lys
Leu Met
Lys Ile
Tyr Leu
Gln Ile
Leu Ala
Glu Ala
Thr Leu
Asn Pro
Pro Ile
Tyr Asn
Asp Trp
His Ser
Cys Ser
Ala Asp
Thr Lys
Phe Asp
Trp Lys
Met Lys
Met Gly
* Tyr
Gln Ile
Cys Asn 
Phe Val 



Met Pro 
Ser Phe 
Trp Gln
Cys Asn
Gln Ser
Val Thr
Pro Lys
Lys Thr
Thr Thr
Phe Met
Ile Ile
Asn Ala 
Tyr Phe 
Glu Ala 
Ala Asn 
Ala Ala 
His Tyr 
Leu Tyr 
Leu Asn 
Val Asn 
Lys Asn 
Pro Pro 
Arg Tyr 
Thr Val 
Val Val 
Leu Pro 
Glu Phe 
Asp Cys 
Arg Asn 
Asn Ile
Asn *
Lys Ala
Cys Tyr
Gln Leu



Lys Leu 
Glu Tyr 
Glu Glu 
Asn Phe 
Thr His 
Phe Ile
Met Thr
His Gln
Leu Gly
Ile Val
Leu Arg
Val Ser
Ser Ala
Ile Met
Glu Leu
Leu Glu
Gln Asp
Glu Ile
Lys Ile
Phe Leu
Leu Gly
Leu Val
Thr Glu
Glu Gln
Gly Ala
Arg Arg
Arg Thr
Pro *
Ser Val
Arg Arg
Tyr Cys
Glu His
Val *
Gln Ile
Figure 15



+ strand (sense)          1st base   sequence (5'->3')

 1. pch13-sp6-1f             370     TTT ACT TCT AAC GCT TAT TC
 2. pch13-sp6-2f             726     TGA AGG AGT CCT TTG AGA CG
 3. T7.1                    1140     TCA CAA TGG GCT ACT GG
 4. T7.2                    1361     TTC AAC GAG GGA GAT GG
 5. T7.3                    1602     TTA GCA CCA CTG AGA GA
 6. T7.4                    2041     GTT CTT TTA GGC ATT TA
 7. ch13-2480               2486     GCT GCG TCT GTT CGT CAG C

- strand (antisense)      1st base   sequence (5'->3')

 8. SP6.1                   2746     CCT CTG CTT CAC AAC AT
 9. SP6.2                   2490     GCA GtA GGG CGG ACA CC (C)
10. SP6.3                   2213     AGG GTC TTC TTC ATT GT
11. SP6.4                   1812     GGA TTG TCT TTG TCT CT
12. pch13-t7-1f             1165     AGT GCA CTT CCA TGG GCG TG
13. pch13-t7-1fa             712     CCT TCA TCA GGT TGA CGA AC
14. pch13-t7-2fa             286     GCG GCA ATC AGA AAC GGA AG
15. CH13-AS-1                536     TGA ACA CGT GGT ACA T
Figure 16(A)



1 

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 



CTTCCCTGAG 
CTCTGCCCTT 
AAATGAAGGC 
AGAAGCATCC 
CAGTAGGTG6 
G-KXyrCGTCC 
GCCTCTAGAC 
ACCAGTCTCT 
TATAGGAAAC 
TTTAACAGCA 
GAGTGCCGGA 
GGGCAGCAGG 
AACAGCGATC 
TGTTGGACTT 
AAGAATGAGC 
CAACAAGAGA 
CAAAGTTAAG 
ACGTTCGACA 
CTTTCAAGCA 
AAAGTGCCTC 
GAGTGCGGTG 
GGAGCTTTCG 
AGAGTGACTC 
TACTGGCXIAA 
TAAACTTCAG 
GAAAACnCA 
TTTAAAGAAG 
GCTCCTCATG 
TGGCCACGGG 
GCCTGTGGCA 
GGAAGATGGA 
TTAGAATAAA 
GTTAGCACCA 
TG CTATC GTC 
TAGTrrCTXSA 
TTGAAAAAGA 
CAAAGACAAT 
GGTICCCCPr 
CTGTGCCATT 
GAAGGAAGOG 
AACCTGCAGA 
GCATTTAAAT 
AAGAAGATGT 
CAAAAAGCTG 
IGAAGAAGAC 
ATCCCTGAAG 
GGCCCTTCAT 
TGGATCACGC 
AAGTTCTAAA 
GCTCCCTGAG 
CAGCTGAGTT 



CCCTTTCTGC 
CTCCGTAAGA 
TTGGGAAGAT 
CTCCTTCCCT 
TTTTTAGAAA 
GTTTGCATGA 
TGCATCTGTC 
TCTTTAAACT 
CACTGATTCC 
ATTCTGCAGA 
CCTCGCACAG 
CGCTGCTQCA 
GTAATCAATC 
CAAGGACAAG 
GGTTCGTCAA 
CCCAACAAGC 
AGCAGGCAAC 
AGAT CATGAT 
TTTTATAAAA 
AGTCGATGCT 
CAGCCTTCAC 
AAGGACATCA 
AGGCCXIATA 
CATACACGCX: 
GAAGTATTTA 
GTGGCAAACT 
GGAAGAAGGA 
TTCAACGAGG 
GATAGAGGAT 
AAGCACGTGT 
GACAAGTICA 
GATCAATCAA 
CTGAGAGAGT 
AQAATAATGA 
ATTATATAAT 
GAATTGAATC 
CCGAATCAGT 
CATGAAACAC 
TCTGGGACTC 
AGGTGGCTCC 
TCTAT CUTIT 

TACIAAAGAG 
CAAGTTTGGT 
CCTAGATGCT 
ACAGCTCGCT 
GGGTGAACAT 
ACCCTA GCCA 
CTTTGGTCGC 
GTCCCAA GGC 
CCTT6TQAAT 



CTGTGTAGGA 
TGGTGCATTA 
GGCTAAAATC 
GGGCCCGCCC 

GGGcrrccrr 

GGAAATGTTC 
ATAGACAAAT 
ITACTTCTAA 
TTGTGTGGAG 
AAGGGCTCGA 
ATGTACCAGC 
GCACTOGAGC 
CTGAGAAAGA 
GTGGACCACG 
CCTGATGAAG 
CTGCAGAACT 
AAAGAAGCCA 
CCTGTTCAGG 
AAGATTTGGC 
GAAAAGTCTA 
CAGCAAGCTG 
TGGTTCATIT 
GACCTCACAG 
CATGGAAGTG 
AGGCATTTTA 
ACTTTGGGAC 
ATTCCAGGTG 
GAGATGGCTT 
AGTGAATTGC 
GCTGA TTAAA 
TTTTTAATOG 
ATTC AGATGA 
GTITCAGGAT 
AGATGAGAAA 
CAGCTGAAAT 
TCTGATAGAC 
ACXIACTACGT 
TAGAATGTAC 
TGATTGATCC 
TGGGTCATCT 
TCCCTCCAGT 
ACTCTGTCCA 
AAGnCCTTT 
TTGTTCTCGT 
GCATnTTTA 
CAGATGATCA 
TAGAAAGAGC 
CT GGCCCCTC 
TCATTTTTCG 
CATCGTOICC 
CTCTO' lVm 



AGCAGAAGGC 
AAACGTTCCT 
AGCAATCCTT 
GTGGGCCTGC 
CAGCGTCATT 

iTAACcrrcc 

GCCCCCATCT 
CGCTTATICT 
AAACAGCTAT 
CCAC OTACIG 
TGTICAGCCG 
GAGTACATCA 
CAAAGACATG 
TCATCGAGGT 
GAGTCCTTTC 
GATCGCAAAG 
CAGACGAGGA 
TTTATCCACG 
AAAAAGACTC 
TCTTGTCAAA 
GAAGGCATGT 
CAAGCAGCAT 
TGAACATACT 
CACTTAACCC 
TCTTCGAAAG 

ATGCTGTrrr 

TCCCTCTIXX 
CAGCTTTGAG 
GCAGAACGCT 
AGTCXXIAAAG 
AGAGTTCAAG 
AGGAAACTGT 
AGACAATATC 
GACTCTTCGT 
TTCCAGTAAA 
AGAGACTATA 
GGCCTGACGC 
CCTCAGAGCA 
AGCTGTGGAC 
TTCAC AAGGC 
TTTTCCTCTA 
AAATAACTTT 
AAAAGGTCTT 
GTGTGATCAT 
GCTCTGAAGA 
GCATTTAGAG 
CAGGGTIC AA 
CCTGITICAT 
TAAOTCAGGT 
GCCCTGCTGC 
GGGGTTGGGG 



GGAATGTCGG 
TATAAACTGG 
GGAATAACGC 
TTGTGCTGTT 
AGCAACAGGA 
GTTTCTGATT 
TTTA CAGAGA 
TTTTACXTTTA 
TAGGAGAACA 
GATGAGAACA 
GGTGAGGGGC 
AGACTTTTGG 
GTCCAAGACC 
CTGCTTCCAG 
AGACGrrCAT 
CATGTGGATT 
GCTGGAGCGG 
GTAA AGATQT 
CITGTTGGGA 
GCTCAAGCAT 
TCAAGGACAT 
ATGCAGAATC 
CACAATGGGC 
CAGAAATGAT 
CACAGTCGTC 
AAAAGCGGAG 
AGACACTGGT 
GAGATAAAAA 
GCAGTCCCTG 
GAAAGGAAGT 
CACAAGTTGT 
TGAGGAACAG 
AGATTGATGC 
CATAATCTTC 
GCCTGGAGAT 
TGGAGAGAGA 
ATCTGCAGAC 
GGAAGCACAC 
ATTGGAAGGC 
TCAAGACTTC 
GTrCTTTTAG 
GAGATTGGAC 
GTTCTTGTCT 
GAGTGCACAA 
TTCCTTAGGT 
TGAAAACAAG 
AOCTGGCGAA 
GTATITCCAA 
TTCTAAGTGA 
GTCTGTTCGT 
CTAGTGTG7T 
TGTGTTTCCA TTCTAAGATT GAC?rCTCGC<A r-rr^n.,^,^
GGGTAACTGC TCTITCATIT TTTtSSS SSfSl "TTGCATTC 
^M^GTT TCGTITCGTr SSJJS SSSSST "^^^TX^CAA 
TCTCTCCTGT AAACTCTAAA AA^SSS S^^i^ CGATCCITCT 
GTCAAGCAC3A GGTTAinTC tcSaaS TCTTGATOTT 
TGGTnTOTG TTOTOTAiaT AtS^S TGITCGTACC 
•TCAGTAGTOA TCTTAGAAGG CTaSStS S^^f^ AAAGGAAAGT 
CATTTAAAAG TACTTTATAT TmSSJ S^S^ TTTGAGATAA 
AAAGCTACCA AAGGAATm ATnTCATTA 
TTCTQCSAATA TACCAAGTIT ATATAaSS JSSSS™ AAGCAATATT 
GAGTCTCTTT TTCAAACATC oSSSJ? S^^^ AAATTATTAA 
CCATA.TAAA A^C^ SSSJ A^^^"^ ™53«KniT 
TCATTTATCA GITCCATCAT A-^^^S^ ArmTATCT rTGAAAATTr 

3--^ TATmrrrr SSS? aacagao^S 

3251 ATTTAATSXA GACTTACTTr GaSaS^ TiSSJf^™' TAATATTClA 

TACATTAATA AAACtS ^CCTTAAAAT 



2551 

2601 

2651 

2701 

2751 

2801 

2851 

2901 

2951 

3001 

3051 

3101 
3151 
3201 
1 FPEPFLPV»E AEGGMSALPF

51 EASLLPWARP WACLCCSVGG 

101 PLDCICHRQM PPSFTENQSL 

151 LTAILQKGLD HLLDENRVPD 

201 TAIVINPEKD KEMVQDLLDF 

251 NKRPNKPAEL lAKHVDSKLR 

301 FEAFYKKDLA KRLLVGKSAS 

351 ELSKDIMVHF KQHMQNQSDS 

401 KLQEVFKAFY LGKHSGRKLQ 

451 LLMFT3EGDGF SFEEIKMATG 

501 EDGDKFIFNG EFKHKLFRIK 

551 AIVRIMKMRK TLGHNLLVSE 

601 KDNPNQYHYV A«RICRRFPF 

651 KEGRWLLGHL SQGSRLQPAD 

701 RRCY* REV/PL KGLVLVSKSC 

751 SLKTARSDDQ HLE^KQGPEM 

801 SCKLWWLIFR KSGF^VSSLR 

851 VFPF*D«VWQ SLFFCIGVTA 

901 LCCKL*KVYG DLKS-CCEAE 

951 Q»»C«KGmrD KDTFEITFKS 

1001 SGIYQVYII* FCAKLLRVSF 

1051 HL»VP«yW« ERPNRFLFFF 

1101 TLIKLCEMQM TH 



SVRWCIKTFL INWK.RLGKM AKISNPWNNA 
F'KGLPSASL ATGVWRLHE EMFLTFRF»L 
L.TLLLTLIL FTLYFKPLIA CVEKQLLGEH 
LAQMYQLFSR VRGGQQALLQ HWSEYIKTFX3 
KDKVDHVIEV CFQKNERFVN LMKESFETFI 
AGNKEATDEE LERTLDKIMI LFRFIHGKDV 
VDAEKSMLSK LKHECQAAFT SKLEO^FKDM 
GPIDLTVNIL TMGYWPTYTP MEVHLTPEMI 
WQTTLGHAVL KAEFKEGKKE FQVSLFQTLV 
lEDSELRRTL QSLACGKARV LIKSPKGKEV 
INQI^^IKETV EEQVSTTERV FQDRQYQIEA 
LYNQLKFPVK PGDLKKRIES LIDRD^ERD 
MKH.NVPSEQ EAHLCHFWDS D-SSCXSHWKA 
VSFSLQFFL* FF^AFKLPLL LCAK^L^DWT 
KPGLFSCVIK SAQ»RRP.ML HFLALKIP.V 
GEH^KEPGFK AGEWtflHPSH WPLPVSCISK 
CQGHGVRPAA SVRQLSSL*! S\7LGVGASVF 
L-FFLIAVFV •U}* •SLVWF LQSCAGTILV 
VILWKD^KDF VGTWFCWYI YMRIIISERKV 
TLYFT^-HVS F-LKATKGIL IMA^VFKAIF 
•NMRV^NMTP CGFPY-NPHS LIVIFIFENF 
LISSLCLEIV NIVI-CRLTL NKISLIGLKI 



501 ™T,rr*«^™. ^ MyOLFSR VRGGQQALLQ HWSEYIKTFG 

ll\ ^^^^ KEMVQDLLDF KnKVDHVIEV CFXJKNEOTVN IMCEsStFI 

301 SSSSSJ^^' »^«VDSKLR AGNKEATDEE LERlTi)KIMI LFRFIHoSJ 

?S ^f^^™ KRLLVGKSAS VDAEKSMLSK LKHECGAATT SKLEGMF^ 

Ini ^f^^^ KQHMQNQSDS GPIDLTVNIL IMGYWmTP MEvStPM 
ill ^^1?^ WQTTLGHAVL KAEFTCEGKKE ^vSf?^ 

451 LLMraBGDGF SFEEIKMATG lEDSELRRTL QSLACGKARV T^TJCOwrWr 

III SSSSS^ iNQi^ S^SS 

III J"^^ LWQLKPPVK PGDUOOUES zSb^^ 
+ strand (sense)          1st base   sequence (5'->3')

 1. pch14-sp6-1f             686     GGC TTA ACA CTC AAT GtA C
 2. pch14-sp6-2f            1005     CTA TGA AAA GAC AGC TTA AG
 3. pch14-sp6-3f            1315     ATT TAG TTT GAA AAG CAT G
 4. pch14-sp6-4f            1589     CAG ACT TTA AAG TCA CAA G
 5. pch14-sp6-5f            1808     CAA AGA CTT GGT GTA TAG TG

- strand (antisense)      1st base   sequence (5'->3')

 6. pch14-sp6-6fb           2020     GCA GTT TAA TTT GGT CCT G
 7. pch14-sp6-5fb           1757     CTG TAA TTA TAG TTC TGT C
 8. pch14-sp6-4fb           1607     CTT GTG ACT TTA AAG TCT G
 9. pch14-sp6-3fb           1339     ATA ATC ATG CTT TTC AAA C
10. pch14-sp6-2rb           1023     TTA AGC TGT CTT TTC ATA G
11. pch14-sp6-1rb            704     GTA CAT TGA GTC TTA AAC C
12. CH14a                    629     CGG CAG AGC TCA CTA CTC GAA
13. CH14b                    644     CAA GCA GGG AAG TAA CGG CAG
14. CH14c                    109     CTT GTT AGC TTG TTT AGA AGG
15.                           90     GGT GGA AGA GAA GGT CTC CTT TCA GGC
J SSftS^'^^ ATTACGGGTC TCGAACAGGA AGCATCTCCA GCAGTCTGTV

ill ^^^^^ AAGCCTGAAA GGAGACCTIC TCTtcScct TctSJSS 

01 CTAACAAGAA TCTCAITITC AAGGCTATAT CTCAAGCTCA AGAATcSJ? 

SI ACAAAAACAA CTAACTACTC TACAGTTCCA CAGaSSgA cSJJSSS 

01 TGCTCCCAGA ACTCCSAACTr CTCAAGAAGA ATT^^ SS^SS 

51 AGC3GACAAAG TAGGACCCCC AGAATAAGTC CcSaTtS 

101 ACAAAAGGAG ArrCTGTAGA AAAAAAOCAA GCTGMMGA SSJSS^ 

m AAACCAGAAA AACiriTCGA GCGCTOoS SSSSS 

SI ITCGGGATGAG TCTCCCTACC S^SSS? SSSSS 

^^S^^ CCAATTOOA ATTTGCTGAA AAATCTITCT TOSTtcSS 
ll t^i^^S^ TATCATGCftA AGTGTACTAA ACCAGAT^ cSSS^S 

oJ ^^^S^S^ AAGAATTCCA GTACTCTCTC CAAAACCAGT tcScSS" 

1 

M IGAAATGGAT TCGACCTCAA AOCaSgSJ S^S^ 

?i I^S^^^r^ 

J TTTATOATCT GGTmAACA ITCGGTCTTr TTG??SS 

J Hy^?^ TAAGGAAGAG CTAAaSS? SaSSS 

1 TGGGGCATCT TTGTGCACTG CTCTTOTOAG GATCAGCATA tSaaSJ^ 

51 ATCATQGTTA GTCATGGTAC TCCAGCOTAG GGGGcSTJ S^JS^ 
)i GATGCAGIGA GGCAGITCTC S^SS 

>1 ACTTTCACTT TTCCCAAAGA TTATATAATC TTCATAATCC AC^^t^ 

;1 CAGCATTOGC CAAAGGTACT GAGGCTCCTT AAAMAtSS iSS^S^^Sflf^ 

? SSSS^"^ ATATGTATTC TCACATTCTC 

1 GAAATGTAAA AATTAGATTT AAATAGTATA TTmAATGA f^AaT^^I^ 

SSSS^ ICAGATCAGA TAGGTAaJS SS^JS 

1 CTTTTCGCCT ACICTAmc TTACAGAGTT ' lTmt, " lUM r -f^Si^^^ 

I AACTGTTAAG GCAAGAAGTO TCftAMGOT SSrS^ I^F™* 

1 CTGArnCAA AGAOT?^;^ iSJ^^SEF TAACAGATCA 



2001 TCAGGAOCAA ATXAAACTGC T 
i SSS gSSSJJ SgSS IS^^

1 TKGDSVEKNQ ^EIEE^Q SSSS ^^^^"^^ RISPPIKEEE 

KAFPNCKFAi KO^S^ CAVHHPISPC 
APPSSSQUTR YFPACkS P^^r?? S^f^^ VLSPKPVAPP 
RHALKWIRPQ TSE.H^ f^f^ nfHPTINVPP 

S^^^PIY LKCLlFQvS ^^.SSS 
WSMFVHCCCE DQHMKLTSWL VMVLOLRGLH GT^SS* 'GRAKFCNI 
^mrWCOTIM FIIHHENSIG SSS5 ^fr^f^^* ^SCHYSKNCT 
EAECYFR.SS SFPAFCDRMN iSlSS^ S^fv*^ V-KA-LYRPL 
MPSK.FTOET TLSQNnS S^SS S^r*^^ YIVLSPINNI 
FSESTENILS YY.R.FLKCk m t t^^^F DFKVTOL.My ICILTF-KIT 
601 UAYCITV^? ScSvl^ ^^JO^^ ^SJ!^^^ -VNCKIDiJ 

651 O.KSGLMQKG ..RLQH^ ^VKCFRVK .QITOFKDLV YSVKN.SLKG 



51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 



ssssj isr^^ ^^^^ 

TKGDSVEKNQ AEMS^^o SSPf^ EWQGQSRTP RISPPIKEEE 
arP SSSQI ^ JS Yggg^P VLSPKPVAP? 

PHMnirrn iu i .l "'"''^ ffyn pgn cBF my Tgpnra. gygp^^^,. 
1


AAAACTTTCG 


GAAGAGAAAG 


51 


AAAAAATTCA 


ATCATGATGG 


101 


GTCTCGAACA 


GGAAGCATCT 


151 


AAAGGAGACC 


TTCTCTTCCA 


201 


TTQAAGGCTA 


TATCTGAAGC 


251 


CTCTACAGTT 


CCACAGAAAC 


301 


CTTCTCAAGA 


AGAATTGCTA 


351 


CCCCAGAATA 


AGTCCXXXXA 


401 


TAGAAAAAAA 


TCAAGATTAC 


451 


ACAAGATCAT 


TTATTCTGAA 


501 


GGCACCAAAC 


CAAGANTCGG 


551 


OnCAGGGAC 


CCTTATGCAG 


601 


GCAAGTCCCA 


AG 



TTOCCTGTGG TAAGTTCAGT TGTTAAAGTA 
AGAAGAGGAG GAAGAAGATC ATGATTACX3G 
CCAGCAGTGT GTCTGTGCCT GCAAAGCCTC 
CCTTCTAAAC AAGCTAACAA GAATCTCATT 
TCAAGAATCC GTAACAAAAA CAACTAACTA 
AGACACTTCC AGrTGCTCCC AGAACTCGAA 
GCAGAAGTGG TCCAGGGGAC AAAGTAGGAC 
TTAAAGAAGA GGAAACAAAA GGAGATTCTG 
TATCACATCG AATCCATGGT CCATCCAGAC 
<3AAGCCAAAG CTGTCTGAGG AAGTANTAGT 
GGATGAAGAC TCCAGATTCC CTTCGGGTTC 
ACACNAGATC TTGTTCAACC AGATAAACCT 



1 KTFGRESCLW •VQLLK^iOIS 

51 KGDLLFHLLN KLTR1»F»RL 

101 LLKKNC-QKW SRGQSRTPRI 

151 TRSFILKKPK LSEEVXVAPN 

201 ASFK 



IMMEKRRKKM MITGLEQEAS PAVCLCLQSL 
yLKL KNP«QK QLTTLQFHRN RHFQLLPELE 
SPPIKEEETK GDSVEKNQDY YEMESMVHAD 
QXSGMKTADS LRVLSGTI/IQ TXDLVQPDKP 



^^ GACGGGNAGN GGAATCNATS GNGGCITCTr CNGAAACNNG 

loi ^^S^?^ NGAGGGGGAC AAGTAGCGGC GTCAITOAGA MAGgSt 

101 gagggtnctc acatcaccnc atctoaocat gncgngccwt ccccantaot 

151 AANANTCATG ATAGNC3QGAA GTCGGCCCAC CCAGAAGoS GATO^SS 

201 CXX5CCACTAN GAAACNNGTr TCTCCANTTA GNCATAoSlA SgtSS^? 

251 CNAGCNGCGT CCCCGGCACC NGCANANNNN CNNCNGGGAC NACNGCCCNN 

301 NNNTONCTTA NNCNGNGNAG NNAAAAAATT CAATCmSt ^SSSS 

351 AGGAAGAAGA TGA TGATTAC GGGTCTTGAA CAGGAAGCAT CtcS^SS 

401 GTGTCTgTGC CTGCAAA wv^van^v^i uxtjcaggagt 



Untitled translated in RF 2 

1 SCSDG3DCM)CW 3aVXICXARWX EGDK.RRDXE BGGEGXHITX SXHXXXSPXX 

QK^IAAASX KX7CPXXHXX XRVXXASPAX aSS^ 
101 XXUCXXXKKF NHDGEEEE ED DDYGSRTOSI SSSUfiVPA '"^^^^ 
CH1-9a11-2

GA AAA CAA ATG GAA GAA ATG CAA AAG GCT TTC AAT AAA ACA ATC GTG 
AAA CTT CAG AAT ACT TCA AGA ATA GCA GAG GAG CAG GAT CAG CGG CAA 
ACT GAA GCC ATC CAG TTG CTA CAG GCA CAG CTG ACC AAC ATG ACA CAG 
CTT GTT CAA 

Lys Gln Met Glu Glu Met Gln Lys Ala Phe Asn Lys Thr Ile Val Lys
Leu Gln Asn Thr Ser Arg Ile Ala Glu Glu Gln Asp Gln Arg Gln Thr
Glu Ala Ile Gln Leu Leu Gln Ala Gln Leu Thr Asn Met Thr Gln Leu
Val Gln



CH8-2a13-1 

GAA CAG GCA AGC AGA TAT GCT ACT GTC AGT GAA AGA GTG CAT GCT CAA 
GTG CAG CAA TTT CTA AAA GAA GGT TAT TTA AGG GAG GAG ATG GTT CTG
GAC AAT ATC CCA AAG CTT CTG AAC TGC CTC AGA GAC TGC AAT GTT GCC
ATC CGA TGG CTG ATG CTT C

Glu Gln Ala Ser Arg Tyr Ala Thr Val Ser Glu Arg Val His Ala Gln
Val Gln Gln Phe Leu Lys Glu Gly Tyr Leu Arg Glu Glu Met Val Leu
Asp Asn Ile Pro Lys Leu Leu Asn Cys Leu Arg Asp Cys Asn Val Ala
Ile Arg Trp Leu Met Leu



CH13-2a12-1

CTC ACA ATG GGC TAC TGG CCA ACA TAC ACG CCC ATG GAA GTC CAC TTA
ACC CCA GAA ATG ATT AAA CTT CAG GAA GTA TTT AAG GCA TTT TAT CTT
GGA AAG CAC AG

Leu Thr Met Gly Tyr Trp Pro Thr Tyr Thr Pro Met Glu Val His Leu
Thr Pro Glu Met Ile Lys Leu Gln Glu Val Phe Lys Ala Phe Tyr Leu
Gly Lys His jr =u 



CH14-2a16-1 

TC TTT GTT CAC CCA AAT TGT AAA TAT GAT GCA AAG TGT ACT AAA CCA
GAT TGT CCC TTC ACT CAT GTG AGT AGA AGA ATT CCA GTA CTG TCT CCA
AAA CCA GTT GCA CCA CCA G

Phe Val His Pro Asn Cys Lys Tyr Asp Ala Lys Cys Thr Lys Pro Asp 
Cys Pro Phe Thr His Val Ser Arg Arg Ile Pro Val Leu Ser Pro Lys
Pro Val Ala Pro Pro • 
CTCAGAGAGG GCTGCCAGGA CGCGAGCCAC TGAGGAGCCG CTCAGCCAGC
GCCATAGCCC TTAGGACTAT CGGTCACATT CTCGCGCTCC TGCTCCGGCT 
CCTCCATCTT GGCCTCGGCA GTGGCGGCTG CCGGGAGGAT GTGCCGCCTT 
CTGGCAGGGG GAAGAAGGAG GAGAAGATGA AGAAGCACCG GCGGGCCTTG 
GCCCTGGTCT CCTGCCTCTT TCTGTGCTCT CTGGTCTGGC TTCCCAGCTG 
GCGTGTATGT TGTAAAGAGA GTTCCTCAGC TTCAGCGTCA TCATATTACT 
CTCAAGATGA CAACTGCGCA CTAGAAAATG AAGATGTACA ATTCCAGAAA 
AAGAATACAG AGTCAAAAAA GTTAAGTCCA CCGGTGGTGG AGACACTCCC 
liS^r''*'" '^'^'^ AGTCTTCCAA TGCAGTTGTG GACAGTGAAA 
CTGTTGAAAA TATTTCCAGC TCATCTACCT CAGAAATCAC TCCAATCTC^ 
AAGCTTGATG AAATAGAAAA ATCTGGTACT ATTCCGATAG CCAAACCAAG 
TGAAACTGAG CAGTCTGAAA CTCATTGTGA T6TTGGTGAG GcJJJJgJJg 
CTAGTGCTCC AATTGAACAA CCTTCCTTTG TCAGTCCACC TGACAGCCTT 
GTTGGCCAGC ATATAGAAAA TGTATCATCT TCACATGGTA AAGGAAAgI^ 
JJ^JiJi^^^ GAATTTGAAT CAAAAGTTTC AGCAAGTGAA CAGGGCGGTG 
GTGATCCAAA ATCTGCATTG AATGCTTCAG ATAATTTAAA AAATGAGAGC 
TCTGATTATA CAAAACCAGG AGACATTGAC CCTACATCAG TAGCAAGT?? 
CAAAGATCCA GAAGATATAC CAACATTTGA TGAATGGAAG AAGAAAGTTA 
TGGAAGTAGA AAAAGAAAAA AGTCAGTCGA TGCATGCATC TTCTAATGQA 
GGTTCACATG CCACCAAAAA GGTCCAGAAA AATCGAAATA ATTATGCCTC 
J^^"" <^TGCCAAAA TTCTAGCAGC TAATCCA6AA GCCAAGAGCA 
CATCTGCTAT TCTTATAGAA AATATGGATC TTTACATCTT GAATCCTTGC 
AGCACTAAAA TTTGGTTTGT TATTGAACTT TGTQAACCAA TTCAAGTAAA 
ACAGCTTGAT ATTGCAAATT ATGAATTArT ITCTTCTACT cS^G^ 
TTCTGGTTTC TATCAGTGAC AGATATCCAA CAAATAAGTG GATTAAGCTG 
GGTACTTTTC ATGGTAGAGA TGAGCGGAAT GTACAGAGTT TCCCTTTAGA 
TGAACAGATG TATGCAAAAT ATGTCAAGGT TCAGTTGCTA TCACATTTTG 
GATCAGAGCA CTTTTGTCCA TTAAGCCTTA TAAGGGTATT TCGCACTAAC 
ATGGTGGAAG AATATGAAGA AATTGCTGAT TCCCAGTATC ACTCAGAACG 
CCAGGAACTA TTTCATCAGG ACTATGATTA TCCACTGGAT JaS^TC 
GAGAGGATAA ATCCTCAAAA AATCTTCTTG GTTCTGCTAC AAATGCCATT 
CTAAATATGG TGAATATTGC TGCTAATATT CTGGGAGCAA AAACTGAAGA 
CCTGACAGAA GGAAATAAAA GTATATCT6A GAATGCCACT GCCACAGCTG 
CACCTAAAAT GCCTGAATCA ACTCCTGTTT CAACTCCTGT TCCATCTCCT 
Irll^J^l^ CCACTGAAGT ACACACACAT GACATGGAGC CGTCAACaS 
iniln^ ^ AAAGAGAGTC CCATTGrACA GTTAGTTCAA GAGGAGGAAG 
^^f'=^<' ''^^^'rCTACA GTGACCCTTC TGGGCAGCGG TGAAcS 
GATGAATCAT CACCCTGGTT TGAGTCAGAG ACACAAATAT TTTgSgTCA 
ACTGACCACA ATTTGTTGTA TTTCTAGTTT TTCAGAATAC ATATATAAAT
GGTGTTCAGT TAGAGTTGCT CTTTATCGGC AGCGCAGCCG AACTGCTTTG 
AGTAAAGGAA AAQATTATCT TGTGTTAGCT CAACCACCCT TACTACTTCC 
TGCGGAATCA GTAGATGTTT CAGTATTGCA ACCTCTGAGT GGAGAATTGG 
AAAATACGAA TATAGAAAGG GAAGCTGAAA CTGTTGTTCT GGGTGATTTA 
AGTAGTAGTA TGCACCAGGA TGACTTGGTG AATCACACTG TAGATGCAGT 
TGAACTTGAA CCAAGCCATT CTCAAACTCT TTCTCAGTCT CTTctJSaG 
ATATTACCCC AGAAATCAAT CCCTTCCCTA AAATAGAAGT ATCTGAGTCT 

gttgaatatg aggcaggaca tataccatca ccagtgatt? ccS^^^ 

TTCTGTTGAG ATCGATAATG AAACAGAACA AAAGTCTGAG AgSJS^J? 
CTATAGAGAA ACCATCTATT ACCTATGAAA CAAATAAAGT TAATGAGTTA 
ATG6ATAATA TTATAAAAGA AGATATGAAC TCCATGCAAA TTTTCACAA^ 
GCTGTCTGAA ACAATAGTGC CACCAATAAA TACAGCCACT GTACCCGAC^ 
ATGAAGATGG GGAAGCCAAA ATGAATATAG CTGACACAGC AAAgS^S 
TTGATTTCTG TTGTGGATTC TTCTTCATTA CCTGAAGTAA AAGaSJJgI 
ACAGTCTCCA GAAGATGCCC rrrTCAGAGG GITACAGAGG AC^T^i 
ATTTTTATGC TGAATTGCAA AATTCTACAG ATCTAGGATA TGCTAAtSa 
AATCTTGTAC ATGGATCAAA CCAAAAGGAG TCAGTATTTA TGAGA^JS^ 
TAATCGTATT AAAGCCTTAG AAGTTAACAT GTCTCTCAGT GGTCGCTATC 
TGGAGGAGCT TAGCCAAAGG TACCGAAAAC AAATGGAAGA AATGCAAAAG 
GCTTTCAACA AAACAATCGT GAAACTTCAG AATACTTCAA GAATAGCMA 
GGAGCAGGAT CAGCGGCAAA CTGAAGCCAT CCAGTTGCTA ^^SgC 
T^CCAACAT GACACAGCrr GTTrCAAATT TATCAGCAAC 
TTGAAACGGG AGGTTTCAGA TCGACAAAGC TATCTTGTCA TATctS^ 
JSJIIf ^T^TGGGAC TGAT6CTTTG TATOCAGCGT TCtSaAA^I 
CTTCTCAATT TGATGGAGAT TATATTTCAA AACTTCCTAA AAGTAATCAG 
TATCCAAGCC CTAAAAGGTG nrCTCrrCC TATGATGATA TGAATTTGA^ 
AAGAAGAACT TCATTCCCAC TCAT6AGATC CAAGTCTCTA CAGTTAACTG 
GCAAAGAAGT AGACCCAAAT GATTTGTACA TTGTAGAACC CCTCAAGTTT 
TCTCCAGAAA AGAAGAAGAA GCX3CTGCAAG TACAAAATTG AAAAAATTGA 
JtSS?!'^'' CCTGAAGAAC CATTGCACCC CATAGCCAAT G6CGACATAA 
AAGGAAGAAA GCCCTTTACG AACCAGAGAG ATTTTTCTAA TATGGGAGAA 
GTTTATCACT CTTCTTATAA AGGTCCTCCA TCTGAAGGAA GCTcSaSJ 
TTCATCACAG TCAGAAGAGT CCTATTTTTG TGGCATTTCA GCTTgScJa 
T.^lt'^ TGGACAGTCT CAAAAGACAA AAACTGAGAA GAGGGCi™ 
AAACGAAGAC GATCTAAAGT CCAAGACCAA GOAAAATTGA TAAAAACTCT 
^JI^™^'' CATTGCCGAG CCTCCATGAC ATAATCAAAG 

GAAACAAAGA GATCACCGTG GGAACATTTO GTGTTACA6C AGTCTCGGGA 
CATATCTAAA ATTAATTGAA CTTTTCATAC AGAAGACTTT TTTGTTGTTG
ll^Jti^"" AACAGTCTGT AGTATTTGAA GGGTTTGGGG GAGGGAGaS 
ATATTAATGG GAAAGGCATT CAGAAATTAT GGTTTCTACC TTTTTaSaA 
GTAGATGGGA TTGTGCTCAA TCTTGGTTAA TCAGCTACAG TTTTAC^ 
CTGATCACTT CCTATAAGGA CAATGGTAGA CATTTTATAA AGATGTTT?? 
TCACAAGATT AATTACTGGG ACAAAAGTAA TTTGGAAGC^ S^cSJI 
GAATGAAAGC CTAAACCTCT TCCTTTAgS SgScSI? 
TTCTTGCACC TTCCCATATT TATGTGCCTT TTGTCTATTT ATAATGcSc 

c??^s?™ r^^^^ ^^^^ ™^ 

SJJ^Sr ^°*AGCTGCA AACACTACAA TGCTTTGAGG GGGTCTGTGC 
CTGAAGCTCA GGAGTGTGGA TCAGACAGTC TAAAGATCCT AAAAACxS^ 
CAACTGGATC TTTGTTTAGC AAACTCACTG GAAATGAaS 
TTTTTAAGTC TGTTCTGTTA GGTAGATGGT GATGCTCtSJ Zl^c^c 

Z^Z^'' tggattactt cttacttagt tactaactca atgISIS^ 

AATCCCTACA GGATCTTTTT TTGCAAACAA CTCATATATG CAGACAAAtJ 
TTTGACAAAT TCACCTTTTA AACACGACGT TAACCGATTT CTG^SSJS 
TCTTTAGCTT ACATTTTAAA CATACACAAT AAACACTAAT cS^SS 
TTCACTGTTT TTATTAGTAT GAATATAAAA TTTGAAGGTT tZI^I 
CATGATATAA TCACAGCCTG CATACATATG CaSg^Sa 
GTTAGTGAGT TTGTCAAGCT TAATCTAATT GGTTAAGTCT AAAG^^Sa 
TTATTCCTTG ATGTTTGCTT TGTATTGGCT ACAAATGTGC AGAgJSIJa 
CATATGTGAT GTCGATGTCT CTGTCTTnT TrrTGTCTTT AAAAaI^JIJ 
l^'^'' TGTATTTGAA TAAAATGA7T TCTTAGTATG ATTG^^S^J 
AATGAATGAA AGTGGAACAT GTTTCTTTTT GAAAGGGAGA GAATTGACcI 
TTTATTGTTG TGATGnTAA OTTATAACTT ATTGAGCACT ^Zt^TG 
ATAACTGm- TTAAACTTGC CTAATACCTT TCTTGGGTAT TGTTTgS 
^tr^.^'Z TTTGTTTGTT TAAGTTGCTG CTtSggJJI 

ACAGCGTGTT TTAGAA6ATT TAAATTTCTT TCCTCTCTGC ACAATTaS 
ATTCAGAGCA AGAGGGCCTG ATTTTATAGA AGCCCcS^A ^IS^CC 
AGATGAGAGC AGAGATACAG TGAGAAATTA TGTGATCTGT ^^^o 
AAGAGAATTT TCAATATGTA ACTACGGAGC TGTAGTGCCA TTAGAaIctg 
TGAATTTCCA AATAAATCTG AACACTTGTC TTTATT ™«AAACTG 
Figure 24



QRGLPGREPL RSRSASAIAL RTIGHILALL LRLLHLGLGS GGCRBDVPPS 
GRGKKEEKMK KHRRALALVS CLFLCSLVWL PSWRVCCKES SSASASSYYS 
QDDNCALENE DVQFQKKNTE SKKLSPPVVE TLPTVDLHEE SSNAVVDSET
VENISSSSTS EITPISKLDE lEKSGTIPIA KPSETEQSET DCDVGEALDA 
SAPIEQPSFV SPPDSLVGQH lENVSSSHGK GKITKSEFES KVSASEQGGG 
DPKSALNASD NLKNESSDYT KPGDIDPTSV ASPKDPEDIP TFDEWKKKVM 
EVEKEKSQSM HASSNGGSHA TKKVQKNRNN YASVECGAKI LAANPEAKST 
SAILIENMDL YMLNPCSTKI WFVIELCEPI QVKQLDIANY ELFSSTPKDF 
LVSISDRYPT NKWIKLGTFH GRDERNVQSF PLDEQMYAKY VKVELLSHFG 
SEHFCPLSLI RVFGTNMVBE YEEIADSQYH SERQELFDED YDYPLDYNTG 
EDKSSKNLLG SATNAILNMV NIAANILGAK TEDLTEGNKS ISENATATAA 
PKMPESTPVS TPVPSPEYVT TEVHTHDMEP STPDTPKESP IVQLVQEEEE 
EASPSTVTLL GSGEQEDESS PWFESETQIF CSELTTICCI SSFSEYIYKW 
CSVRVALYRQ RSRTALSKGK DYLVLAQPPL LLPAESVDVS VLQPLSGELE 
NTNIEREAET VVLGDLSSSM HQDDLVNHTV DAVELEPSHS QTLSQSLLLD
ITPEINPLPK lEVSESVBYE AGHIPSPVIP QESSVEIDNE TEQKSESFSS 
lEKPSITYET NKVNELMDNI IKEDMNSMQI FTKLSETIVP PINTATVPDN 
EDGEAKMNIA DTAKQTLISV VDSSSLPEVK EEEQSPEDAL LRGLQRTATD 
FYAELQNSTD LGYANGNLVH GSNQKESVPM RLNNRIKALE VNMSLSGRYL 
EELSQRYRKQ MEEMQKAFNK TIVKLQNTSR lAEEQDQRQT EAIQLLQAQL 
TNMTQLVSNL SATVAELKRE VSDRQSYLVI SLVLCWLGL MLCMQRCRNT 
SQFDGDYISK LPKSNQYPSP KRCFSSYDDM NLKRRTSFPL MRSKSLQLTG 
KEVDPNDLYI VEPLKFSPEK KKKRCKYKIE KIETIKPEEP LHPIANGDIK 
GRKPFTNQRD FSNMGEVYHS SYKGPPSEGS SETSSQSEES YFCGISACTS 
LCNGQSQKTK TEKRALKRRR SKVQDQGKLI KTLIQTKSGS LPSLHDIIKG 
NKEITVGTFG VTAVSGHLN -LNPSYRRLF CCCSLKNSL. YLKGLGEGEN 
INGKGIQKLW FLPF«KVDGI VLNLG«»ATV LQS»SLPIRT MVDIL.RCFF 
TRLITGTKVI WKPSSLGGIG MKA»TSSFSP VPISCTFPYL CAFCLFIMPL 
EEEG.LPLLF DFFYNFVRFL KLQTLQCFEG . VCA.SSGVWI RQSKDPKNLP 
TGSLFSKLTG NEHLMEFLSL FC»VDGDALV IFTYSGWITS YLVTNSMRKK 
SLQDLFLQTT DICRQIFDKF TF«TRR«PIC EGFL«LTF«T YTINTNPPNF 
HCFY.YEYKI .RFGQLVQVS .YNHSLHTYA QIQLVSLSSL LLVKSKEII 
IP-CLLCIGY KCAEVIHM'C RCLCLFFCL. KIIGSNCLI K^FLSMIVQ. 
•MKVEHVSF* KGEN-PPIW MPKL»LIEHP ••••LFLNLP NTFLGYCL-C 
DLFNAFFVCL SCCFRLTACP RRPKFLSCLH N.LPRARGPD FIEAP.REVQ 
MRAE1Q»BIM •SVCCGKRIF NM«LRSCSAI RNCEFPNKSE HLSL 
Figure 25(A)



TAGAATTCAG CGGCCGCTGA ATTCTAGCTG CGGGGTAGGA GTCCGCGGCA 
GCCTCCGGGT AAGCCAAGCG CCGCGCAGTG CTGAGTTCCC GCACGCCGCA 
GAGCCATGGA GATCGGCACC GAGACCAGCC GCAAGATCCG GAGTGCCATT 
AAGGGGAAAT TACAAGAATT AGGAGCTTAT GTTGATGAAG AACTTCCTGA 
TTACATTATG GTGATGGTGG CCAACAAGAA AAGTCAGGAC CAAATGACAG 
AGGATCTGTC CCTGTTTCTA GGGAACAACA CAATTCGATT CACCGTATGG 
CTTCATGGTG TATTAGATAA ACTTCGCTCT GTTACAACTG AACCCTCTAG 
TCTGAAGTCT TCTGATACCA ACATCTTTGA TAGTAACGTG CCTTCAAACA 
AGAACAATTT CAGTCGGGGA GATGAGAGGA GGCATGAAGC TGCAGTGCCA 
CCACTTGCCA TTCCTAGCGC GAGACCTGAA AAAAGAGATT CCAGAGTTTC 
TACAAGTTCG CAGGAGTCAA AAACCACAAA TGTCAGACAG ACTTACGATG 
ATGGAGCTGC AACCCGACTA ATGTCAACAG TGAAACCTTT GAGGGAGCCA 
GCACCCTCTG AAGATGTGAT TGATATTAAG CCAGAACCAG ATGATCTCAT 
TGACGAAGAC CTCAACTTTG TGCAGGAGAA TCCCTTATCT CAGAAAGAAC 
CTACAGTGAC ACTTACATAT GGTTCTTCTC GCCCTTCTAT TGAAATTTAT 
CGACCACCTG CAAGTAGAAA TGCAGATAGT GGTGTTCATT TAAACAGGTT 
GCAATTTCAA CAGCAGCAGA ATAGTATTCA TGCTGCCAAG CAGCTTGATA 
TGCAGAGTAG TTGGGTATAT GAAACAGGAC GTTTGTGTGA ACCAGAGGTG 
CTTAACAGCT TAGAAGAAAC GTATAGTCCG TTCTTTAGAA ACAACTCGGA 
GAAAATGAGT ATGGAGGATG AAAACTTTCG GAAGAGAAAG TTGCCTGTGG 
TAAGTTCAGT TGTTAAAGTA AAAAAATTCA ATCATGATGG AGAAGAGGAG 
GAAGAAGATG ATGATTACGG GTCTCGAACA GGAAGCATCT CCAGCAGTGT
GTCTGTGCCT GCAAAGCCTG AAAGGAGACC TTCTCTTCCA CCTTCTAAAC 
AAGCTAACAA GAATCTGATT TTGAAGGCTA TATCTGAAGC TCAAGAATCC 
GTAACAAAAA CAACTAACTA CTCTACAGTT CCACAGAAAC AGACACTTCC 
AGTTGCTCCC AGAACTCGAA CTTCTCAAGA AGAATTGCTA GCAGAAGTGG 
TCCAGGGACA AAGTAGGACC CCCAGAATAA GTCCCCCCAT TAAAGAAGAG 
GAAACAAAAG GAGATTCTGT AGAAAAAAAT CAAGCTGAGA TGAGTGAACT
GAGTGTGGCA CAGAAACCAG AAAAACTTTT GGAGCGCTGC AAGTACTGGC 
CTGCTTGTAA AAATGGGGAT GAGTGTGCCT ACCATCACCC CATCTCACCC 
TGCAAAGCCT TCCCCAATTG TAAATTTGCT GAAAAATGTT TGTTTGTTCA 
CCCAAATTGT AAATATGATG CAAAGTGTAC TAAACCAGAT TGTCCCTTCA 
CTCATGTGAG TAGAAGAATT CCAGTACTGT CTCCAAAACC AGTTGCACCA 
CCAGCACCAC CTTCCAGTAG TCAGCTCTGC CGTTACTTCC CTGCTTGTAA 
GAAGATGGAA TGTCCCTTCT ATCATCCAAA ACATTGTAGG TTTAACACTC 
AATGTACAAG TCCGGACTGC ACATTCTACC ATCCCACCAT TAATGTCCCA 
CCACGACATG CCTTGAAATG GATTCGACCT CAAACCAGCG AATAGCACCC 
AGTCCTGCCT GGCAGAAGAT CATGCAGTTT GGAAGTTTTC ATGTACTGAT 
Figure 25(B)



GAAAGATACT CTACAGAACT TGTCAAATCT TTGAAACTTG GAATATATTG 
CTTTCATAAT ATGAAGTTTT ATTGCCTATC TATCTGAAGT GTCTAATTTT 
TCAAGTTTGT AAGTTTATTA TGTGGTTTTA ACATTGGGTG TTTTTGTTTT 
GTTTTTACTA TGAAAAGACA GCTTAAGGAA GAGCTAAATT CTGTTAAAAT 
ATTTGGGGCA TGTTTGTGCA CTGCTGTTGT GAGGATCAGC ATATGAAATT 
GACATCATGG TTAGTCATGG TACTGCAGCT TAGGGGGCTA CACGGTTGCT 
GTGTGAGTGG AGAGATGCAG TGAGGCAGTT GTCATTATTC TAAAAATTGT
ACTACTTTCA CTTTTCCCAA AGATTATATA ATGTTCATAA TCCACCATGA 
AAACAGCATT GGCCAAAGGT ACTGAGGCTG CTTAAAATAT TCAATTCTGC 
TTTTTAATTT TTAAGTGAAT TTAGTTTGAA AAGCATGATT ATACAGGCCT 
CTCAGGCTGA GTGCTACTTT CGGTAAAGTT CCAGTTTTCC TGCCTTCTGT 
GACAGGATGA ATGAGGTGGG TATGGACAGT GGAGGCAGCT GGAATGGCAA 
GTGCAGAAAA TAGGAACAGT TCTATACAGT GCTCTCATTT ACTAATAACA 
TAATGCCTTC TAAATAATTT TTTTGGGAAA CTACATTATC ACAAAATTAT 
ACAAATTTTT TTACAAGTAT TTACATACTG TATCTGAAAA CAGACTTTAA 
AGTCACAAGA TTATAAATGT ACATATGTAT TCTCACATTC TGAAAAATAA 
CATTCTCAGA ATCCACAGAA AATATACTTA GTTACTACTG AAGATAATTT 
TTGAAATGTA AAAATTAGAT TTAAATAGTA TATTTTAAAT GACAGAACTA 
TAATTACAGA GATCAGATCA GATAGGTAAA CTGCAAGATA GATAGGATGA 
AACTTTTGGC CTACTGTATT ACTTACAGAG TTTTTTTGTG TGTGGTTTTT 
AAAACTGTTA AGGCAAGAAG TGTCAAATGC TTTAGAGTTA AATAACAGAT 
CACTGATTTC AAAGACTTGG TGTATAGTGT TAAAAATTAA AGCTTAAAAG 
GTGGTTAGAA AAGTGGATTA ATGCAAAAGG GGTAATAAAG ACTGCAACAT 
TCTCAGGACC AAATTAAACT GCTAA 
Figure 26



•NSAAAEF»L RGRSPRQPPG KPSAAQC«VP ARRRAMEIGT ETSRKIRSAI 
KGKLQELGAY VDEELPDYIM VMVANKKSQD QMTEDLSLFL GNNTIRFTVW 
LHGVLDKLRS VTTEPSSLKS SDTNIFDSNV PSNKNNFSRG DERRHEAAVP 
PLAIPSARPE KRDSRVSTSS QESKTTNVRQ TYDDGAATRL MSTVKPLREP 
APSEDVIDIK PEPDDLIDED LNFVQENPLS QKEPTVTLTY GSSRPSIEIY
RPPASRNADS GVHLNRLQFQ QQQNSIHAAK QLDMQSSWVY ETGRLCEPEV
LNSLEETYSP FFRNNSEKMS MEDENFRKRK LPVVSSVVKV KKFNHDGEEE
EEDDDYGSRT GSISSSVSVP AKPERRPSLP PSKQANKNLI LKAISEAQES
VTKTTNYSTV PQKQTLPVAP RTRTSQEELL AEVVQGQSRT PRISPPIKEE
ETKGDSVEKN QAEMSELSVA QKPEKLLERC KYWPACKNGD ECAYHHPISP 
CKAFPNCKFA EKCLFVHPNC KYDAKCTKPD CPFTHVSRRI PVLSPKPVAP 
PAPPSSSQLC RYFPACKKME CPFYHPKHCR FNTQCTSPDC TFYHPTINVP 
PRHALKWIRP QTSE»HPVLP GRRSCSLEVF MY»»KILYRT CQIFETWNIL 
LS»YEVLLPI YLKCLIFQVC KFIMWP»HWV FLFCFYYEKT A»GRAKFC»N 
IWGMFVHCCC EDQHMKLTSW LVMVLQLRGL HGCCVSGBMQ •GSCHYSKNC 
TTFTFPKDYI MFIIHHENSI GQRY»GCLKY SILLFNP^VN LV-KA-LYRP 
LRLSATFGKV PVPLPSVTG* MRWVWTVEAA GMASAENRNS SIQCSHLLIT 
•CLLNNPFGK LHYHKIIQIF LQVFTYCI.K QTLKSQDYKC TYVFSHSEK* 
HSQNPQKIYL VTTEDNF-NV KIRFK-YILN DRTIITEIRS DR«TAR«IG« 
NFWPTVIiLTE FFCVWFLKLL RQEVSNALEL NNRSLISKTW CIVLKIKA-K 
VVRKVD«CKR GNKDCNILRT KLNC*" 



MEIGT ETSRKIRSAI 

KGKLQELGAY VDEELPDYIM VMVANKKSQD QMTEDLSLFL GNNTIRFTVW 
LHGVLDKLRS VTTEPSSLKS SDTNIFDSNV PSNKNNFSRG DERRHEAAVP 
PLAIPSARPE KRDSRVSTSS QESKTTNVRQ TYDDGAATRL MSTVKPLREP 
APSEDVIDIK PEPDDLIDED LNFVQENPLS QKEPTVTLTY GSSRPSIEIY 
RPPASRNADS GVHLNRLQFQ QQQNSIHAAK QLDMQSSWVY ETGRLCEPEV 
LNSLEETYSP FFRNNSEKMS MEDENFRKRK LPVVSSVVKV KKFNHDGEEE
EEDDDYGSRT GSISSSVSVP AKPERRPSLP PSKQANKNLI LKAISEAQES
VTKTTNYSTV PQKQTLPVAP RTRTSQEELL AEVVQGQSRT PRISPPIKEE
ETKGDSVEKN QAEMSELSVA QKPEKLLERC KYWPACKNGD ECAYHHPISP 
CKAFPNCKFA EKCLFVHPNC KYDAKCTKPD CPFTHVSRRI PVLSPKPVAP 
PAPPSSSQLC RYFPACKKME CPFYHPKHCR FNTQCTSPDC TFYHPTINVP 
PRHALKWIRP QTSE 